Displaying PDFs lazily with Vue
Building a PDF Viewer with Vue - Part 2
As we demonstrated in the previous post, we can render pages of a PDF to <canvas>
elements using PDF.js and Vue. We were able to use a simple Vue component hierarchy to separate the responsibilities of data fetching and page rendering. We used the PDF.js library to fetch the page data and hand off the work of drawing the data onto <canvas>
elements.
In this post, we'll add a new requirement: we should only render pages when they are visible, i.e., as they are scrolled into the viewport. Previously, we were rendering all pages eagerly, regardless of whether they appeared in the browser viewport. For a large PDF, this could mean valuable resources are spent rendering many offscreen pages that may never be viewed. Let's see how we can fix that using Vue.
The latest source code for this project is on GitHub at rossta/vue-pdfjs-demo. To see the version of the project described in this post, check out the part-2-scrolling
branch. Here's the project demo:
Adding scroll behavior
To review, once a <PDFPage>
component mounts, it calls the page.render
method to draw the PDF data to the <canvas>
element. To defer page rendering, this method should only be called once the <canvas>
element has become visible in the scroll window of the document. We'll detect visibility of the page by inferring it from the scroll boundaries of the parent component, <PDFDocument>
along with the position and dimensions of the child <PDFPage>
components.
First, a CSS change to make our document scrollable within a relatively positioned parent element.
.pdf-document {
  position: absolute;
  overflow: auto;
  width: 100%;
  top: 0;
  bottom: 0;
  left: 0;
  right: 0;
}
The <PDFDocument>
will track its visible boundaries using the scrollTop
and clientHeight
properties of its element. We'll record these boundaries when the component mounts.
// src/components/PDFDocument.vue
data() {
  return {
    scrollTop: 0,
    clientHeight: 0,
    // ...
  };
},
methods: {
  updateScrollBounds() {
    const {scrollTop, clientHeight} = this.$el;
    this.scrollTop = scrollTop;
    this.clientHeight = clientHeight;
  },
  // ...
},
mounted() {
  this.updateScrollBounds();
},
// ...
The scrollTop
according to MDN:
An element's
scrollTop
value is a measurement of the distance from the element's top to its topmost visible content.
The clientHeight
according to MDN:
The
clientHeight
read-only property is zero for elements with no CSS or inline layout boxes, otherwise it's the inner height of an element in pixels, including padding but not the horizontal scrollbar height, border, or margin.
Used together, these two properties let us determine what portion of the document is visible to the user.
Detecting page visibility
The <PDFPage>
component will track the boundaries of its underlying canvas element, whose dimensions we demonstrated how to calculate in the previous post. As with the document component, we'll trigger the update of this data property when the page component mounts:
// src/components/PDFPage.vue
data() {
  return {
    elementTop: 0,
    elementHeight: 0,
    // ...
  };
},
methods: {
  updateElementBounds() {
    const {offsetTop, offsetHeight} = this.$el;
    this.elementTop = offsetTop;
    this.elementHeight = offsetHeight;
  },
  // ...
},
mounted() {
  this.updateElementBounds();
},
// ...
The element's offsetTop property represents the distance from its top boundary to the top of its offsetParent, the nearest positioned ancestor. Because .pdf-document is absolutely positioned, that ancestor is the containing document element. Recording its offsetHeight enables us to determine how far the bottom of the element is from the top of the container.
Note that the updateElementBounds
and updateScrollBounds
methods are necessary because properties of DOM elements are outside of Vue's control, i.e., they are not reactive. These methods exist to maintain reactive copies of these properties in Vue and we must trigger them somehow when scrolling or resizing the window so that the changes will propagate.
Since we can pass the scroll data of the parent component to the child page components as props, we now have what we need to determine if a given page is visible in the scroll area of the document.
// src/components/PDFPage.vue
props: {
  scrollTop: {
    type: Number,
    default: 0
  },
  clientHeight: {
    type: Number,
    default: 0
  },
  // ...
},
computed: {
  isElementVisible() {
    const {elementTop, elementBottom, scrollTop, scrollBottom} = this;
    // Not measured yet: treat the page as hidden
    if (!elementBottom) return false;
    return elementTop < scrollBottom && elementBottom > scrollTop;
  },
  elementBottom() {
    return this.elementTop + this.elementHeight;
  },
  scrollBottom() {
    return this.scrollTop + this.clientHeight;
  },
  // ...
},
// ...
We'll use a computed property, isElementVisible, which updates whenever the scroll bounds (scrollTop, clientHeight) or the element bounds (elementTop, elementHeight) change. It simply checks that the top of the element is above the bottom of the scroll area (elementTop < scrollBottom) and the bottom of the element is below the top of the scroll area (elementBottom > scrollTop). Note that the y dimension increases moving down the screen.
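To make the overlap check concrete, here is a minimal standalone sketch of the same comparison outside of Vue. The function name isRangeVisible and the pixel values are made up for illustration; the logic mirrors the isElementVisible computed property.

```javascript
// Hypothetical helper mirroring the isElementVisible computed property.
// All values are pixel offsets measured from the top of the scroll container.
function isRangeVisible({ elementTop, elementHeight, scrollTop, clientHeight }) {
  const elementBottom = elementTop + elementHeight;
  const scrollBottom = scrollTop + clientHeight;
  if (!elementBottom) return false; // element has not been measured yet
  return elementTop < scrollBottom && elementBottom > scrollTop;
}

// Assumed layout: 800px-tall pages stacked in a 600px-tall viewport.
const viewport = { scrollTop: 0, clientHeight: 600 };
const firstPage = { elementTop: 0, elementHeight: 800 };
const secondPage = { elementTop: 800, elementHeight: 800 };

const firstVisible = isRangeVisible({ ...firstPage, ...viewport });
const secondVisible = isRangeVisible({ ...secondPage, ...viewport });
// After scrolling down 300px, the top of the second page enters the viewport.
const secondVisibleAfterScroll = isRangeVisible({ ...secondPage, ...viewport, scrollTop: 300 });
```

With these assumed numbers, only the first page overlaps the viewport initially; the second page starts to overlap once the scroll area's bottom edge (scrollTop + clientHeight) passes 800px.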
For another approach to detecting visibility in Vue, check out Akryum/vue-observe-visibility on GitHub, which is also available as an npm package.
Lazy rendering pages
Previously, we called the drawPage
method (described in the previous post) when the page component mounted. To make the page render lazily, now we call the method only when the element becomes visible, using a watcher.
// src/components/PDFPage.vue
watch: {
  isElementVisible(isElementVisible) {
    if (isElementVisible) this.drawPage();
  },
  // ...
},
// ...
We've defined drawPage
such that it will only render once if called multiple times.
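The drawPage method itself comes from the previous post and isn't repeated here. As a rough sketch of how a render-once guard might look (the factory name createPageDrawer and the renderContext argument are assumptions for illustration, not the project's actual code), we can cache the task returned by page.render and skip later calls:

```javascript
// Hypothetical sketch: wrap a PDF.js page so that render() runs at most once.
function createPageDrawer(page, renderContext) {
  let renderTask = null;
  return function drawPage() {
    // A render was already started: reuse it instead of drawing again
    if (renderTask) return renderTask;
    renderTask = page.render(renderContext);
    return renderTask;
  };
}

// Stand-in for a PDF.js page object that counts render calls.
let renderCalls = 0;
const fakePage = {
  render() {
    renderCalls += 1;
    return { promise: Promise.resolve() };
  },
};

const drawPage = createPageDrawer(fakePage, {});
const firstTask = drawPage();
const secondTask = drawPage(); // no second render; returns the cached task
```

Because the guard lives inside drawPage, the isElementVisible watcher can call it every time a page becomes visible without worrying about repeat renders.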
In the page components, we can simply watch for changes in scroll boundaries and scale—changes to these props may cause a previously "hidden" page to become visible in the browser.
// src/components/PDFPage.vue
watch: {
  scale: 'updateElementBounds',
  scrollTop: 'updateElementBounds',
  clientHeight: 'updateElementBounds',
  // ...
},
// ...
For the document component, we add listeners to DOM events to trigger the updateScrollBounds
method within the mounted
hook.
// src/components/PDFDocument.vue
import throttle from 'lodash/throttle';

export default {
  // ...
  mounted() {
    this.updateScrollBounds();
    // Run updateScrollBounds at most once every 300ms
    const throttledCallback = throttle(this.updateScrollBounds, 300);
    this.$el.addEventListener('scroll', throttledCallback, true);
    window.addEventListener('resize', throttledCallback, true);
    // Keep a reference so the window listener can be removed later
    this.throttledOnResize = throttledCallback;
  },
  beforeDestroy() {
    window.removeEventListener('resize', this.throttledOnResize, true);
  },
  // ...
};
A few notes about the implementation above: we use lodash's throttle function to ensure our callback is triggered at most once every 300ms; otherwise, we'd potentially be making this update dozens of times a second, which is unnecessary for our purposes and could become a performance bottleneck. Since we attach our throttledCallback to the 'scroll' event of this.$el, that listener goes away with the element when the component is torn down. The 'resize' event, however, only fires on the window, so we store a reference to the throttled callback as this.throttledOnResize and remove the listener in Vue's beforeDestroy hook.
For a great explanation of throttling (and its cousin, debouncing) event callbacks, check out this post on CSS-Tricks.
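To give a feel for what throttling does, here is a deliberately simplified, leading-edge-only throttle. It is not how lodash implements throttle (lodash also handles trailing calls and cancellation); the simpleThrottle name and the injectable clock are assumptions made so the sketch can run without real timers.

```javascript
// Simplified leading-edge throttle for illustration only.
// `now` is injectable so the behavior can be checked with a fake clock.
function simpleThrottle(fn, wait, now = Date.now) {
  let lastCall = -Infinity;
  return function throttled(...args) {
    const time = now();
    if (time - lastCall < wait) return; // still inside the quiet period
    lastCall = time;
    fn.apply(this, args);
  };
}

// Simulate scroll events arriving every 100ms, from t=0 to t=900.
let fakeTime = 0;
let updates = 0;
const onScroll = simpleThrottle(() => { updates += 1; }, 300, () => fakeTime);

for (fakeTime = 0; fakeTime <= 900; fakeTime += 100) {
  onScroll();
}
// Ten events were fired, but the callback ran only at t=0, 300, 600 and 900.
```

In the component we rely on lodash's version instead, which behaves similarly for our purposes and also runs a trailing call by default so the final scroll position isn't missed.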
Adding "infinite" scrolling
So far we have deferred rendering individual pages to mounted canvas elements until they are scrolled into view. This allows us to spare CPU cycles at the cost of a brief visual delay as newly visible pages are drawn. However, we are still creating a <PDFPage> component for every PDF page, regardless of whether it is visible. This leaves a blank <canvas> element below the fold for every page that has not yet been scrolled into view.
We can go one step further. Instead of fetching all the pages up front, we'll fetch pages in batches as the user scrolls to the bottom of the document. In other words, we'll implement "infinite scrolling" for PDF pages (though most PDFs of which I'm aware are finite in length). Fetching in batches is a compromise between eagerly loading all pages and fetching one at a time.
To keep things simple for this tutorial, we'll add batching directly to the <PDFDocument>
component; in a future post, we'll extract this logic to other parts of our application.
Batched fetching
Recall in our document component, we're tracking a pdf
property and an array of pages
. We now add a cursor
to represent the highest page number in the document we've attempted to fetch. We also will track the expected pageCount
using a property provided by the pdf
object.
// src/components/PDFDocument.vue
data() {
  return {
    pdf: undefined,
    pages: [],
    cursor: 0,
    // ...
  };
},
computed: {
  pageCount() {
    return this.pdf ? this.pdf.numPages : 0;
  },
  // ...
},
// ...
We also previously added a watcher for the pdf
property to fetch all pages:
// src/components/PDFDocument.vue
watch: {
  pdf(pdf) {
    this.pages = [];
    const promises = range(1, pdf.numPages)
      .map(number => pdf.getPage(number));
    Promise.all(promises)
      .then(pages => (this.pages = pages));
  },
  // ...
},
// ...
We'll modify this watcher by extracting a method to fetch pages in batches:
// src/components/PDFDocument.vue
watch: {
  pdf(pdf) {
    // Reset batching state so a newly loaded document starts from page 1
    this.pages = [];
    this.cursor = 0;
    this.fetchPages();
  },
  // ...
},
// ...
Here is our new fetchPages
implementation:
// src/components/PDFDocument.vue
const BATCH_COUNT = 10;

export default {
  // ...
  methods: {
    fetchPages() {
      if (!this.pdf) return;
      const currentCount = this.pages.length;
      // Stop once every page has been fetched
      if (this.pageCount > 0 && currentCount === this.pageCount) return;
      // A batch beyond the fetched pages has already been requested
      if (this.cursor > currentCount) return;

      const startPage = currentCount + 1; // PDF page numbering starts at 1
      const endPage = Math.min(currentCount + BATCH_COUNT, this.pageCount);
      this.cursor = endPage;

      getPages(this.pdf, startPage, endPage)
        .then((pages) => {
          // Insert the batch at the position it was requested for
          this.pages.splice(currentCount, 0, ...pages);
          return this.pages;
        })
        .catch(() => {
          this.$emit('document-errored');
        });
    },
    // ...
  },
  // ...
};
The added complexity in fetchPages allows us to request small batches of pages with each subsequent call. The currentCount represents the total number of pages that have already been fetched. The startPage is the number of the next page not yet in the array, and the endPage of the batch is the lesser of currentCount + BATCH_COUNT and the total pageCount. We're able to insert these pages at the correct location in the tracked pages array with this.pages.splice(currentCount, 0, ...pages). We also use the this.cursor property to track the most recently requested endPage to ensure the same batch is only requested once.
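The getPages helper isn't shown in this section. A minimal sketch of how it might look, together with the batch arithmetic pulled out into a pure function, follows. Both the body of getPages and the batchRange name are assumptions for illustration; the only PDF.js call used is pdf.getPage(number), which returns a promise for a page.

```javascript
// Hypothetical helper: compute the 1-based page range of the next batch.
function batchRange(currentCount, pageCount, batchSize) {
  const startPage = currentCount + 1;
  const endPage = Math.min(currentCount + batchSize, pageCount);
  return { startPage, endPage };
}

// Assumed shape of getPages: request pages startPage..endPage (inclusive)
// and resolve once all of them have been fetched.
function getPages(pdf, startPage, endPage) {
  const requests = [];
  for (let number = startPage; number <= endPage; number += 1) {
    requests.push(pdf.getPage(number));
  }
  return Promise.all(requests);
}

// Stand-in for a 25-page PDF.js document that records requested page numbers.
const requested = [];
const fakePdf = {
  numPages: 25,
  getPage(number) {
    requested.push(number);
    return Promise.resolve({ pageNumber: number });
  },
};

const firstBatch = batchRange(0, fakePdf.numPages, 10); // pages 1 to 10
const lastBatch = batchRange(20, fakePdf.numPages, 10); // pages 21 to 25
getPages(fakePdf, lastBatch.startPage, lastBatch.endPage);
```

Note how the final batch is clamped to the document's page count, so a 25-page document ends with a short batch of five pages rather than requesting pages that don't exist.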
Why splice?
You may ask, why not simply add the new pages on to the end of the this.pages
array instead? You could imagine using an expression like this.pages.push.apply(this.pages, pages)
to modify the array in place or replacing the array altogether with this.pages = [...this.pages, ...pages]
or concat
. The reason is that getPages
is asynchronous—it returns a promise that fulfills when all pages in the batch have been fetched. It is safer to assume this method can be called in rapid succession where multiple batch requests may be in flight simultaneously. Using splice
to add new pages at the expected position will ensure our batches are inserted into the this.pages
array in the correct order.
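As a small illustration of the difference, the sketch below applies two batches whose responses arrive out of order, once with splice at the requested position and once with push. The page labels and the insertBatch helper are made up for the example, and it only covers this two-batch case.

```javascript
// Hypothetical helper: insert a batch at the index it was requested for.
function insertBatch(pages, requestedIndex, batch) {
  pages.splice(requestedIndex, 0, ...batch);
  return pages;
}

const batchOne = ['p1', 'p2', 'p3']; // requested when 0 pages were loaded
const batchTwo = ['p4', 'p5', 'p6']; // requested for index 3

// Suppose the second batch resolves before the first.
const spliced = [];
insertBatch(spliced, 3, batchTwo); // index 3 is past the end, so it lands at the end
insertBatch(spliced, 0, batchOne); // the first batch is inserted in front

const pushed = [];
pushed.push(...batchTwo);
pushed.push(...batchOne); // arrival order wins, so the pages end up out of order
```

Here spliced ends up in page order while pushed does not, which is the behavior the splice call in fetchPages is relying on.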
Finding the bottom
To determine whether the user has scrolled to the bottom of the last of the fetched pages, we will again lean on properties of this.$el
. We can ask if the sum of the scrollTop
of the document and its visible height, clientHeight
, has equalled its total scrollHeight
.
// src/components/PDFDocument.vue
methods: {
  isBottomVisible() {
    const {scrollTop, clientHeight, scrollHeight} = this.$el;
    return scrollTop + clientHeight >= scrollHeight;
  },
  // ...
},
// ...
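One practical wrinkle: scrollTop can be a fractional value while scrollHeight and clientHeight are rounded, so an exact comparison may narrowly miss the bottom. A possible variation is to allow a small tolerance. The isAtBottom name and the 1px default are assumptions for illustration, and the function takes a plain object so it can be exercised without a DOM:

```javascript
// Hypothetical variation on isBottomVisible that tolerates sub-pixel rounding.
// `el` is any object exposing scrollTop, clientHeight and scrollHeight.
function isAtBottom(el, tolerance = 1) {
  const { scrollTop, clientHeight, scrollHeight } = el;
  return scrollHeight - scrollTop - clientHeight <= tolerance;
}

// Assumed measurements: a 600px viewport over 2400px of content.
const midway = isAtBottom({ scrollTop: 900, clientHeight: 600, scrollHeight: 2400 });
// Half a pixel short of the bottom still counts as having reached it.
const nearlyThere = isAtBottom({ scrollTop: 1799.5, clientHeight: 600, scrollHeight: 2400 });
```

In the component, this would be called as isAtBottom(this.$el) in place of the strict comparison, if the rounding issue shows up in practice.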
We'll call this method from updateScrollBounds and record the result in a tracked boolean property, didReachBottom.
// src/components/PDFDocument.vue
data() {
  return {
    didReachBottom: false,
    // ...
  };
},
methods: {
  updateScrollBounds() {
    const {scrollTop, clientHeight} = this.$el;
    this.scrollTop = scrollTop;
    this.clientHeight = clientHeight;
    this.didReachBottom = this.isBottomVisible();
  },
  // ...
},
// ...
We can then use a watcher to call fetchPages whenever this property flips from false to true. The watcher fires again each time the user scrolls to the bottom of the newly fetched pages, so the cycle repeats until every page has been fetched.
// src/components/PDFDocument.vue
watch: {
  didReachBottom(didReachBottom) {
    if (didReachBottom) this.fetchPages();
  },
  // ...
},
// ...
For another in-depth look at adding infinite scrolling for Vue, check out Chris Nwamba's post on Scotch.io. There are also a number of packages that abstract infinite scrolling if you'd prefer to lean on open source, including Akryum/vue-virtual-scroller and ElemeFE/vue-infinite-scroll.
Wrapping up
We've succeeded in making our documents lazier; now we can defer both data fetching and page rendering until necessary, potentially improving performance of the initial page load and avoiding waste, especially for large documents.
We've been adding quite a bit of complexity though to our existing <PDFDocument>
and <PDFPage>
components; between them, they are now responsible for making API requests, calculating element boundaries, lazy behavior, etc. Ideally, we'll want to limit the responsibility of a given component to make our application easier to change. In the next post, we'll refactor our PDF viewer to separate out data fetching and scrolling behavior into separate "renderless components". These changes will subsequently allow us to share code and add a new feature: a preview pane.
And now you've reached the bottom of this post!