Out-of-the-box PDF rendering in Windows Apps

One of the lesser-known features of UWP lives in the Windows.Data.Pdf namespace, which includes classes for reading PDF documents and rendering them. At first sight, this hardly seems useful in any real-life app, since

  • it doesn’t support PDF generation, and
  • it only renders PDF documents as images.

The second limitation in particular restricts real-life use cases: while many apps need to display PDF files, they also require vector rendering (which allows e.g. text selection), full-text search etc., which is why third-party PDF frameworks must be referenced anyway.

However, there are certain scenarios that require only basic PDF rendering functionality. In these cases, the built-in PDF classes may come in handy to avoid including (and probably purchasing) third-party libraries, so it is a good idea to know the principles of accessing the Windows.Data.Pdf helper classes.

To illustrate such a small-scale scenario, I’d like to build upon the simple file preview list I presented a few weeks ago – if you haven’t read that blog article, you might want to at least have a look at its code snippets to get an idea of what we’re building on (don’t worry, it’s not that complicated – never more than a few lines of XAML code actually).

The basic idea is: If we have a list of files that displays thumbnail previews for image files and the (textual) file name for all other files, shouldn’t it be possible to render a small preview of PDF files as well? Fortunately, the Windows.Data.Pdf namespace gives us all we need!

Obviously, we first need to extend the existing code base with a third DataTemplate (no problem thanks to the DataTemplateSelector!):

	public class FileItemTemplateSelector : DataTemplateSelector
	{
		protected override DataTemplate SelectTemplateCore(object item, DependencyObject container)
		{
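			// A null or unexpected item falls through to the general template
			var file = item as FileObject;
			switch (file?.MimeType)
			{
				case "image/png":
				case "image/jpeg": // the registered MIME type for JPEG files
				case "image/jpg":
					return ImageFileItemTemplate;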
				case "application/pdf":
					return PdfFileItemTemplate;
				default:
					return GeneralFileItemTemplate;
			}
		}

		public DataTemplate GeneralFileItemTemplate { get; set; }
		public DataTemplate ImageFileItemTemplate { get; set; }
		public DataTemplate PdfFileItemTemplate { get; set; }
	}

<Page.Resources>
    <DataTemplate x:Key="GeneralFileItemTemplate">
        <TextBlock Text="{Binding FileName}"/>
    </DataTemplate>

    <DataTemplate x:Key="ImageFileItemTemplate">
        <Image Source="{Binding FilePath}"/>
    </DataTemplate>

    <!-- A DataTemplate may only have one root element, so both controls share a Grid -->
    <DataTemplate x:Key="PdfFileItemTemplate">
        <Grid>
            <controls:PdfViewer x:Name="PdfViewer" Source="{Binding FilePath}"/>
            <Image Visibility="{Binding ElementName=PdfViewer, Path=LoadError, Converter={StaticResource BooleanToVisibilityConverter}}" Source="Assets/PdfIcon.png"/>
        </Grid>
    </DataTemplate>

    <helpers:FileItemTemplateSelector x:Key="FileItemTemplateSelector" GeneralFileItemTemplate="{StaticResource GeneralFileItemTemplate}" ImageFileItemTemplate="{StaticResource ImageFileItemTemplate}" PdfFileItemTemplate="{StaticResource PdfFileItemTemplate}"/>

</Page.Resources>

The template selection logic should be rather self-explanatory. The DataTemplate for PDF files itself contains two controls: One custom-made PdfViewer (this is the actual core content of this blog post, we’ll come to it in a moment!), and an image that displays a generic PDF icon (which is only visible if the PDF rendering is not successful) as fallback.
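
The XAML above references a BooleanToVisibilityConverter resource. In case your project does not already contain one, here is a minimal sketch of such a converter; the class itself and its namespace are assumptions on my side and not part of the original sample:

```csharp
using System;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Data;

namespace FilePreview.Converters // hypothetical namespace
{
	// Maps true to Visible and everything else to Collapsed
	public sealed class BooleanToVisibilityConverter : IValueConverter
	{
		public object Convert(object value, Type targetType, object parameter, string language)
		{
			return (value is bool && (bool) value) ? Visibility.Visible : Visibility.Collapsed;
		}

		public object ConvertBack(object value, Type targetType, object parameter, string language)
		{
			return value is Visibility && (Visibility) value == Visibility.Visible;
		}
	}
}
```

To use it, the converter would need to be declared inside Page.Resources under the key BooleanToVisibilityConverter, with an XML namespace prefix pointing to wherever you placed the class.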

Alright, what about this PdfViewer component we referenced in XAML but haven’t talked about yet? Basically, it is a custom control – to start off, we simply create a new class called PdfViewer that derives from Windows.UI.Xaml.Controls.Grid (which, in contrast to inheriting from the basic UserControl class, allows us to easily position child elements, so that we only need to take care of the core PDF rendering functionality). In addition, our PdfViewer class gets two dependency properties that can be used from within XAML, and one method that will contain the loading and rendering logic:

public class PdfViewer : Grid
{
	public static readonly DependencyProperty SourceProperty = DependencyProperty.Register(
		"Source", typeof(Uri), typeof(PdfViewer), new PropertyMetadata(default(Uri), SourceChangedCallback));

	public Uri Source
	{
		get { return (Uri) GetValue(SourceProperty); }
		set { SetValue(SourceProperty, value); }
	}

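	// Invoked by the dependency property system whenever Source changes
	private static async void SourceChangedCallback(DependencyObject sender, DependencyPropertyChangedEventArgs args)
	{
		var control = (PdfViewer) sender; // the callback is only registered for PdfViewer
		var source = args.NewValue as Uri;

		await control.LoadDocument(source);
	}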

	public static readonly DependencyProperty LoadErrorProperty = DependencyProperty.Register(
		"LoadError", typeof(bool), typeof(PdfViewer), new PropertyMetadata(default(bool)));

	public bool LoadError
	{
		get { return (bool) GetValue(LoadErrorProperty); }
		set { SetValue(LoadErrorProperty, value); }
	}

	private async Task LoadDocument(Uri source)
	{
		// TODO
	}
}

Although this is just the skeleton that will soon be filled with action, let me explain the parts our user control consists of:

  • The Source property is the main connection point to the viewer’s environment. It expects the PDF file’s location as a Uri, passed either explicitly in C# code or through binding from within XAML. Whenever the Source property’s value changes (again, either by setting it explicitly or through binding), the SourceChangedCallback method is invoked, which in turn calls the LoadDocument method.
  • The LoadError property is meant to be read-only for consumers. If anything goes wrong during PDF loading and rendering, it is set to true, causing the fallback PDF icon image to be displayed on top of the (then empty) PdfViewer. In a production environment, you might want to turn this into a string property that carries the actual error message to be displayed to the user.
  • The LoadDocument method is called internally whenever the Source property changes (see above). It takes care of reading the file, loading its PDF content, rendering it and displaying the resulting image.
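
To make the "explicitly in C# code" variant a bit more tangible, here is a small sketch of how the control could be created from code-behind. The helper class, the host panel and the fixed preview size are purely illustrative assumptions; only PdfViewer and its Source property come from the skeleton above:

```csharp
using System;
using Windows.UI.Xaml.Controls;

public static class PdfViewerUsage // hypothetical helper, for illustration only
{
	// Must be called on the UI thread, like any other code that creates XAML controls
	public static PdfViewer AddPreview(Panel host, Uri pdfUri)
	{
		var viewer = new PdfViewer
		{
			Source = pdfUri, // triggers SourceChangedCallback and thus LoadDocument
			Width = 120,     // arbitrary preview size
			Height = 160
		};
		host.Children.Add(viewer);
		return viewer;
	}
}
```

A call could then look like PdfViewerUsage.AddPreview(RootPanel, new Uri("ms-appx:///Assets/Sample.pdf")), where both RootPanel and the file name are made up for this example.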

The missing link is the LoadDocument method’s content. As mentioned above, it needs to perform the following steps:

  1. Open the file – this is done using the common UWP StorageFile class
  2. Read the file’s content as PDF document – here, we’ll be using the Windows.Data.Pdf.PdfDocument class for the first time (by the way, it contains several methods for reading PDF content from either files or streams, even for PDF documents that are password protected)
  3. In case one of these two steps fails, set the LoadError flag and skip the rendering
  4. Open the PDF document’s first page (we won’t need the others, since all we want is a preview of the PDF contents) – the PdfDocument.GetPage() method returns a PdfPage object, which in turn offers a RenderToStreamAsync method that converts the PDF content to a plain image
  5. Create an Image control based on that stream, and add it to the component (remember it inherits from Grid!) as its content

With these steps in mind, the following code should not surprise you:

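// Requires: System, System.Threading.Tasks, Windows.Data.Pdf, Windows.Storage,
// Windows.Storage.Streams, Windows.UI.Xaml.Controls, Windows.UI.Xaml.Media.Imaging
private async Task LoadDocument(Uri source)
{
	Children.Clear(); // Empty the PDF viewer first

	try
	{
		// Open the file and read its content as a PDF document
		StorageFile file = await StorageFile.GetFileFromApplicationUriAsync(source);
		PdfDocument document = await PdfDocument.LoadFromFileAsync(file);

		// PdfPage holds native resources, so dispose it once the image is created
		using (PdfPage pdfPage = document.GetPage(0)) // Only load the first page
		using (var stream = new InMemoryRandomAccessStream())
		{
			await pdfPage.RenderToStreamAsync(stream);
			var bmp = new BitmapImage();
			await bmp.SetSourceAsync(stream); // Decode without blocking the UI thread
			Children.Add(new Image { Source = bmp });
		}
	}
	catch (Exception)
	{
		// TODO: Log exception
		// Also reached if rendering fails, so that LoadError covers loading and rendering
		Children.Clear();
		LoadError = true;
		return;
	}

	LoadError = false;
}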
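
As a side note on step 2: for password-protected documents, PdfDocument.LoadFromFileAsync has an overload that takes the password as a second argument. The following is only a rough sketch of how that could be wrapped; the helper class and the way the password is obtained are assumptions and not part of our sample:

```csharp
using System;
using System.Threading.Tasks;
using Windows.Data.Pdf;
using Windows.Storage;

public static class PdfLoader // hypothetical helper, for illustration only
{
	// Returns the loaded document, or null if the file could not be opened
	// (e.g. because it is not a valid PDF or the password is wrong)
	public static async Task<PdfDocument> TryLoadAsync(StorageFile file, string password = null)
	{
		try
		{
			return string.IsNullOrEmpty(password)
				? await PdfDocument.LoadFromFileAsync(file)
				: await PdfDocument.LoadFromFileAsync(file, password);
		}
		catch (Exception)
		{
			// TODO: Log exception
			return null;
		}
	}
}
```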

That’s it, our basic sample should compile and run successfully. However, there are two more things I’d like to point out performance-wise:

  • Check out the PdfPage.RenderToStreamAsync method and note that it has an overload that accepts PdfPageRenderOptions, which in turn allows specifying the desired width and height of the target image (DestinationWidth and DestinationHeight) – especially when creating only small preview thumbnails as in our sample, this can improve performance by avoiding unnecessarily detailed rendering! However, make sure to set these values so that the PDF page’s original aspect ratio is maintained, otherwise the thumbnail will not match the page’s proportions.
  • I tried to keep the code snippets shown as simple as possible, so our sample loads the PDF document and renders the thumbnail image from scratch each time a list item corresponding to a PDF file is to be displayed. For files that are displayed frequently, it might pay off to do the rendering only once and save the resulting thumbnail to an image file for re-use.
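
Both points can be combined in a few lines. The following is a sketch under a couple of assumptions: the helper class, the cache file name and the choice of the app's temporary folder are mine, and you would still need your own logic to decide when a cached thumbnail is outdated:

```csharp
using System;
using System.Threading.Tasks;
using Windows.Data.Pdf;
using Windows.Storage;
using Windows.Storage.Streams;

public static class PdfThumbnailCache // hypothetical helper, for illustration only
{
	// Height that keeps the page's aspect ratio for a given target width
	public static uint ScaledHeight(double pageWidth, double pageHeight, uint targetWidth)
	{
		return (uint) Math.Max(1, Math.Round(targetWidth * pageHeight / pageWidth));
	}

	// Renders a scaled-down image of the page into the temporary folder and returns the file.
	// cacheFileName and targetWidth are chosen by the caller.
	public static async Task<StorageFile> RenderAsync(PdfPage page, string cacheFileName, uint targetWidth)
	{
		var options = new PdfPageRenderOptions
		{
			DestinationWidth = targetWidth,
			DestinationHeight = ScaledHeight(page.Size.Width, page.Size.Height, targetWidth)
		};

		StorageFile file = await ApplicationData.Current.TemporaryFolder.CreateFileAsync(
			cacheFileName, CreationCollisionOption.ReplaceExisting);

		// Render straight into the file instead of an in-memory stream
		using (IRandomAccessStream stream = await file.OpenAsync(FileAccessMode.ReadWrite))
		{
			await page.RenderToStreamAsync(stream, options);
		}

		return file;
	}
}
```

On subsequent loads, the viewer could first look for the cached file and only fall back to rendering if it is missing. As far as I know, the rendered image is PNG-encoded unless the render options say otherwise, so a .png file name would be a sensible choice.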