# Migration Guide: 1.x to 2.x

`PyPDF2<2.0.0` ([docs](https://pypdf2.readthedocs.io/en/1.27.12/meta/history.html)) is very different from `PyPDF2>=2.0.0` ([docs](../meta/history.md)). Luckily, most changes are simple naming adjustments. This guide helps you move from `PyPDF2 1.x` (or even the original PyPdf) to `PyPDF2>=2.0.0`.

You can execute your code with the updated version and show deprecation warnings by running `python -W all your_code.py`.

# Imports and Modules

* `PyPDF2.utils` no longer exists
* `PyPDF2.pdf` no longer exists. You can import from `PyPDF2` directly or from `PyPDF2.generic`

# Naming Adjustments

## Classes

The base classes were renamed because they also work with BytesIO streams, not only with files. Also, the default value of the `strict` parameter changed from `strict=True` to `strict=False`.

* `PdfFileReader` ➔ `PdfReader`
* `PdfFileWriter` ➔ `PdfWriter`
* `PdfFileMerger` ➔ `PdfMerger`

`PdfFileReader` and `PdfFileMerger` no longer have the `overwriteWarnings` parameter. The new behavior corresponds to `overwriteWarnings=False`.

## Function, Method, and Property Names

In `PyPDF2.xmp.XmpInformation`:

* `rdfRoot` ➔ `rdf_root`
* `xmp_createDate` ➔ `xmp_create_date`
* `xmp_creatorTool` ➔ `xmp_creator_tool`
* `xmp_metadataDate` ➔ `xmp_metadata_date`
* `xmp_modifyDate` ➔ `xmp_modify_date`
* `xmpMetadata` ➔ `xmp_metadata`
* `xmpmm_documentId` ➔ `xmpmm_document_id`
* `xmpmm_instanceId` ➔ `xmpmm_instance_id`

In `PyPDF2.generic`:

* `readObject` ➔ `read_object`
* `convertToInt` ➔ `convert_to_int`
* `DocumentInformation.getText` ➔ `DocumentInformation._get_text`: This method should typically not be used; please let me know if you need it.
* `readHexStringFromStream` ➔ `read_hex_string_from_stream`
* `initializeFromDictionary` ➔ `initialize_from_dictionary`
* `createStringObject` ➔ `create_string_object`
* `TreeObject.hasChildren` ➔ `TreeObject.has_children`
* `TreeObject.emptyTree` ➔ `TreeObject.empty_tree`

In many places:

- `getObject` ➔ `get_object`
- `writeToStream` ➔ `write_to_stream`
- `readFromStream` ➔ `read_from_stream`

PdfReader class:

- `reader.getPage(pageNumber)` ➔ `reader.pages[page_number]`
- `reader.getNumPages()` / `reader.numPages` ➔ `len(reader.pages)`
- `getDocumentInfo` ➔ `metadata`
- `flattenedPages` attribute ➔ `flattened_pages`
- `resolvedObjects` attribute ➔ `resolved_objects`
- `xrefIndex` attribute ➔ `xref_index`
- `getNamedDestinations` / `namedDestinations` attribute ➔ `named_destinations`
- `getPageLayout` / `pageLayout` ➔ `page_layout` attribute
- `getPageMode` / `pageMode` ➔ `page_mode` attribute
- `getIsEncrypted` / `isEncrypted` ➔ `is_encrypted` attribute
- `getOutlines` ➔ `get_outlines`
- `readObjectHeader` ➔ `read_object_header`
- `cacheGetIndirectObject` ➔ `cache_get_indirect_object`
- `cacheIndirectObject` ➔ `cache_indirect_object`
- `getDestinationPageNumber` ➔ `get_destination_page_number`
- `readNextEndLine` ➔ `read_next_end_line`
- `_zeroXref` ➔ `_zero_xref`
- `_authenticateUserPassword` ➔ `_authenticate_user_password`
- `_pageId2Num` attribute ➔ `_page_id2num`
- `_buildDestination` ➔ `_build_destination`
- `_buildOutline` ➔ `_build_outline`
- `_getPageNumberByIndirect(indirectRef)` ➔ `_get_page_number_by_indirect(indirect_ref)`
- `_getObjectFromStream` ➔ `_get_object_from_stream`
- `_decryptObject` ➔ `_decrypt_object`
- `_flatten(..., indirectRef)` ➔ `_flatten(..., indirect_ref)`
- `_buildField` ➔ `_build_field`
- `_checkKids` ➔ `_check_kids`
- `_writeField` ➔ `_write_field`
- `_write_field(..., fieldAttributes)` ➔ `_write_field(..., field_attributes)`
- `_read_xref_subsections(..., getEntry, ...)` ➔ `_read_xref_subsections(..., get_entry, ...)`

PdfWriter
class:

- `writer.getPage(pageNumber)` ➔ `writer.pages[page_number]`
- `writer.getNumPages()` ➔ `len(writer.pages)`
- `addMetadata` ➔ `add_metadata`
- `addPage` ➔ `add_page`
- `addBlankPage` ➔ `add_blank_page`
- `addAttachment(fname, fdata)` ➔ `add_attachment(filename, data)`
- `insertPage` ➔ `insert_page`
- `insertBlankPage` ➔ `insert_blank_page`
- `appendPagesFromReader` ➔ `append_pages_from_reader`
- `updatePageFormFieldValues` ➔ `update_page_form_field_values`
- `cloneReaderDocumentRoot` ➔ `clone_reader_document_root`
- `cloneDocumentFromReader` ➔ `clone_document_from_reader`
- `getReference` ➔ `get_reference`
- `getOutlineRoot` ➔ `get_outline_root`
- `getNamedDestRoot` ➔ `get_named_dest_root`
- `addBookmarkDestination` ➔ `add_bookmark_destination`
- `addBookmarkDict` ➔ `add_bookmark_dict`
- `addBookmark` ➔ `add_bookmark`
- `addNamedDestinationObject` ➔ `add_named_destination_object`
- `addNamedDestination` ➔ `add_named_destination`
- `removeLinks` ➔ `remove_links`
- `removeImages(ignoreByteStringObject)` ➔ `remove_images(ignore_byte_string_object)`
- `removeText(ignoreByteStringObject)` ➔ `remove_text(ignore_byte_string_object)`
- `addURI` ➔ `add_uri`
- `addLink` ➔ `add_link`
- `getPage(pageNumber)` ➔ `get_page(page_number)`
- `getPageLayout` / `setPageLayout` / `pageLayout` ➔ `page_layout` attribute
- `getPageMode` / `setPageMode` / `pageMode` ➔ `page_mode` attribute
- `_addObject` ➔ `_add_object`
- `_addPage` ➔ `_add_page`
- `_sweepIndirectReferences` ➔ `_sweep_indirect_references`

PdfMerger class:

- `__init__` parameter: `strict=True` ➔ `strict=False` (the `PdfFileMerger` still has the old default)
- `addMetadata` ➔ `add_metadata`
- `addNamedDestination` ➔ `add_named_destination`
- `setPageLayout` ➔ `set_page_layout`
- `setPageMode` ➔ `set_page_mode`

Page class:

- `artBox` / `bleedBox` / `cropBox` / `mediaBox` / `trimBox` ➔ `artbox` / `bleedbox` / `cropbox` / `mediabox` / `trimbox`
- `getWidth` / `getHeight` ➔ `width` / `height`
- `getLowerLeft_x` / `getUpperLeft_x`
➔ `left`
- `getUpperRight_x` / `getLowerRight_x` ➔ `right`
- `getLowerLeft_y` / `getLowerRight_y` ➔ `bottom`
- `getUpperRight_y` / `getUpperLeft_y` ➔ `top`
- `getLowerLeft` / `setLowerLeft` ➔ `lower_left` property
- `upperRight` ➔ `upper_right`
- `mergePage` ➔ `merge_page`
- `rotateClockwise` / `rotateCounterClockwise` ➔ `rotate_clockwise`
- `_mergeResources` ➔ `_merge_resources`
- `_contentStreamRename` ➔ `_content_stream_rename`
- `_pushPopGS` ➔ `_push_pop_gs`
- `_addTransformationMatrix` ➔ `_add_transformation_matrix`
- `_mergePage` ➔ `_merge_page`

XmpInformation class:

- `getElement(..., aboutUri, ...)` ➔ `get_element(..., about_uri, ...)`
- `getNodesInNamespace(..., aboutUri, ...)` ➔ `get_nodes_in_namespace(..., about_uri, ...)`
- `_getText` ➔ `_get_text`

utils.py:

- `matrixMultiply` ➔ `matrix_multiply`
- `RC4_encrypt` is moved to the security module

## Parameter Names

* `PdfWriter.get_page`: `pageNumber` ➔ `page_number`
* `PyPDF2.filters` (all classes): `decodeParms` ➔ `decode_parms`
* `PyPDF2.filters` (all classes): `decodeStreamData` ➔ `decode_stream_data`
* `pagenum` ➔ `page_number`
* `PdfMerger.merge`: `position` ➔ `page_number`
* `PdfWriter.add_outline_item_destination`: `dest` ➔ `page_destination`
* `PdfWriter.add_named_destination_object`: `dest` ➔ `page_destination`
* `PdfWriter.encrypt`: `user_pwd` ➔ `user_password`
* `PdfWriter.encrypt`: `owner_pwd` ➔ `owner_password`

## Deprecations

A few classes / functions were deprecated without replacement:

* `PyPDF2.utils.ConvertFunctionsToVirtualList`
* `PyPDF2.utils.formatWarning`
* `PyPDF2.isInt(obj)`: Use `isinstance(obj, int)` instead
* `PyPDF2.u_(s)`: Use `s` directly
* `PyPDF2.chr_(c)`: Use `chr(c)` instead
* `PyPDF2.barray(b)`: Use `bytearray(b)` instead
* `PyPDF2.isBytes(b)`: Use `isinstance(b, type(bytes()))` instead
* `PyPDF2.xrange_fn`: Use `range` instead
* `PyPDF2.string_type`: Use `str` instead
* `PyPDF2.isString(s)`: Use `isinstance(s, str)` instead
* `PyPDF2._basestring`: Use `str` instead
* `b_(...)`
was removed. You can usually use the bytes object directly; otherwise you can [copy this](https://github.com/py-pdf/PyPDF2/pull/986#issuecomment-1230698069).
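
The standard-library replacements from the deprecation list can be sketched as follows. This is a minimal illustration, not code from PyPDF2: the variable names are made up for the example, and `isinstance(b, bytes)` is used as the shorter equivalent of `isinstance(b, type(bytes()))`. The block only needs the standard library.

```python
# Illustrative values only; none of these names come from PyPDF2.
obj = 42
s = "hello"
b = b"\x00\x01"

# PyPDF2.isInt(obj)   ->  isinstance(obj, int)
is_int = isinstance(obj, int)

# PyPDF2.isString(s)  ->  isinstance(s, str)
is_string = isinstance(s, str)

# PyPDF2.isBytes(b)   ->  isinstance(b, bytes)
is_bytes = isinstance(b, bytes)

# PyPDF2.u_(s)        ->  use s directly
text = s

# PyPDF2.chr_(c)      ->  chr(c)
letter = chr(65)

# PyPDF2.barray(b)    ->  bytearray(b)
buffer = bytearray(b)

# PyPDF2.xrange_fn    ->  range
indices = list(range(3))
```

Running the migrated script with `python -W all your_code.py`, as described at the top of this guide, should then show no deprecation warnings for these helpers.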