From charlesreid1

Page Objects

A Page object represents a single MediaWiki page.

Many of the Site object's methods return Page objects or PageGenerators (see Pywikibot/Sites).

To get a Site object in the first place, you need a user-config.py file (see Pywikibot/Setup). Then run the following from Python, in the directory that contains user-config.py:

import pywikibot
s = pywikibot.Site()
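
The examples further down use a page variable. A minimal sketch of one way to create it, assuming the default site from user-config.py; get_page is a hypothetical helper name, not part of pywikibot:

```python
def get_page(title, site=None):
    """Return a pywikibot Page for the given title (hypothetical helper)."""
    # Imported lazily so this module still loads where pywikibot is missing.
    import pywikibot

    if site is None:
        # Site() with no arguments uses the settings from user-config.py.
        site = pywikibot.Site()
    return pywikibot.Page(site, title)
```

For example, page = get_page('AOCP/Binomial Coefficients') would give the page used in the transcripts on this page.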

Revision Objects

Each page has an edit history, consisting of the versions of the page at different points in time. A Revision object holds the information for one revision of one page, such as its revision ID, timestamp, user, and edit comment.

Useful Actions

To build a graph of the links on a given wiki, these Page methods are useful:

  • page.backlinks - lists pages that link to the given page
  • page.linkedPages - lists pages on this wiki that this page links to
  • page.extlinks - lists external targets that this page links to
  • page.getReferences - similar to backlinks, but includes transclusions too
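
As a rough sketch of how these methods could feed a link graph, the function below does a breadth-first crawl over linkedPages() and records the result as a dict of titles. The name build_link_graph and the max_pages cap are assumptions for illustration; it only relies on page.title() and page.linkedPages().

```python
from collections import deque


def build_link_graph(start_page, max_pages=50):
    """Breadth-first crawl of outgoing wiki links, starting at start_page.

    Returns a dict mapping each expanded page title to the list of titles
    it links to. max_pages caps how many pages are expanded.
    """
    graph = {}
    queue = deque([start_page])
    while queue and len(graph) < max_pages:
        page = queue.popleft()
        title = page.title()
        if title in graph:
            continue  # already expanded via another path
        targets = list(page.linkedPages())
        graph[title] = [t.title() for t in targets]
        # Queue targets not yet expanded so their links are collected later.
        queue.extend(t for t in targets if t.title() not in graph)
    return graph
```

Each expanded page costs at least one API request, so a small max_pages is a sensible starting point.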

backlinks

In [43]: page.backlinks
Out[43]: <bound method deprecated_args.<locals>.decorator.<locals>.wrapper of Page('AOCP/Binomial Coefficients')>

In [44]: page.backlinks()
Out[44]: <itertools.chain at 0x10838c828>

In [45]: list(page.backlinks())
Out[45]:
[Page('Flags'),
 Page('Algorithms/Combinatorics and Heuristics'),
 Page('Algorithms/Combinatorics'),
 Page('AOCP/Multisets'),
 Page('AOCP/Permutations'),
 Page('Template:AOCPFlag'),
 Page('AOCP/Multinomial Coefficients'),
 Page('AOCP/Harmonic Numbers'),
 Page('AOCP/Fibonacci Numbers'),
 Page('ACOP/Generating Functions'),
 Page('AOCP'),
 Page('AOCP/Generating Functions'),
 Page('Generating Functions'),
 Page('AOCP/Combinatorics'),
 Page('Cards'),
 Page('Binomial Coefficients'),
 Page('AOCP/Generating Permutations and Tuples'),
 Page('Letter Coverage'),
 Page('Five Letter Words')]


linkedPages

Calling linkedPages() returns all pages that the current page links to. The method returns a PageGenerator object, like the site's allpages() method. As with backlinks, we pass the generator to list() to consume it and build a list from the results.

In [46]: list(page.linkedPages())
Out[46]:
[Page('AOCP/Boolean Functions'),
 Page('AOCP/Combinatorial Algorithms'),
 Page('AOCP/Infinite Series'),
 Page('Algorithm Analysis/Randomized Quick Sort'),
 Page('Algorithm Analysis/Substring Pattern Matching'),
 Page('ACOP/Generating Functions'),
 Page('AOCP/Combinatorics'),
 Page('AOCP/Fibonacci Numbers'),
 Page('AOCP/Five Letter Words'),
 Page('AOCP/Generating Permutations and Tuples'),
 Page('AOCP/Harmonic Numbers'),
 Page('AOCP/Multinomial Coefficients'),
 Page('AOCP/Multisets'),
 Page('Algorithm Analysis/Matrix Multiplication'),
 Page('Algorithm Analysis/Merge Sort'),
 Page('Algorithm complexity'),
 Page('Algorithmic Analysis of Sort Functions'),
 Page('Algorithms'),
 Page('Algorithms/Combinatorics'),
 Page('Algorithms/Combinatorics and Heuristics'),
 Page('Algorithms/Data Structures'),
 Page('Algorithms/Graphs'),
 Page('Algorithms/Optimization'),
 Page('Algorithms/Search'),
 Page('Algorithms/Sort'),
 Page('Algorithms/Strings'),
 Page('Amortization'),
 Page('Amortization/Accounting Method'),
 Page('Binary Search'),
 Page('Binary Search Modifications'),
 Page('CS'),
 Page('Cards'),
 Page('Divide and Conquer'),
 Page('Divide and Conquer/Master Theorem'),
 Page('Estimation'),
 Page('Estimation/BitsAndBytes'),
 Page('Five Letter Words'),
 Page('Flags'),
 Page('Heap Sort'),
 Page('Letter Coverage'),
 Page('Merge Sort'),
 Page('Project Euler'),
 Page('Quick Sort'),
 Page('Rubiks Cube/Permutations'),
 Page('Rubiks Cube/Tuples'),
 Page('Skiena Chapter 4 Questions'),
 Page('Theta vs Big O'),
 Page('Template:AOCPFlag'),
 Page('Template:AlgorithmsFlag'),
 Category('Category:AOCP')]

In [47]: type(page.linkedPages())
Out[47]: pywikibot.data.api.PageGenerator
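
Since the result is a generator, we do not have to build the whole list when only a few items are needed. A small sketch using itertools.islice from the standard library; first_titles is a hypothetical helper name:

```python
from itertools import islice


def first_titles(pages, n=5):
    """Return the titles of at most n pages from an iterable of Page objects."""
    # islice stops after n items, so nothing further is pulled
    # from the underlying generator.
    return [p.title() for p in islice(pages, n)]
```

Usage would look like first_titles(page.linkedPages(), n=5).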

extlinks

Calling extlinks() returns a plain generator that yields the external URLs as strings:

In [48]: type(page.extlinks())
Out[48]: generator

In [49]: list(page.extlinks())
Out[49]:
['http://charlesreid1.com/w/index.php?title=Template:AOCPFlag&action=edit',
 'http://charlesreid1.com/w/index.php?title=Template:AlgorithmsFlag&action=edit',
 'https://charlesreid1.com:3000/cs/study-plan']
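
Because extlinks() yields plain URL strings, they can be summarized with the standard library. A sketch that counts links per hostname; count_hosts is a hypothetical helper name:

```python
from collections import Counter
from urllib.parse import urlsplit


def count_hosts(urls):
    """Count how many URLs point at each hostname (port and path ignored)."""
    return Counter(urlsplit(u).hostname for u in urls)
```

Usage would look like count_hosts(page.extlinks()).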

getReferences

The results are almost identical to backlinks: in this example, only one item (Page('Binomial Coefficients')) appears in backlinks but not in getReferences.

In [50]: list(page.getReferences())
Out[50]:
[Page('Flags'),
 Page('Algorithms/Combinatorics and Heuristics'),
 Page('Algorithms/Combinatorics'),
 Page('AOCP/Multisets'),
 Page('AOCP/Permutations'),
 Page('Template:AOCPFlag'),
 Page('AOCP/Multinomial Coefficients'),
 Page('AOCP/Harmonic Numbers'),
 Page('AOCP/Fibonacci Numbers'),
 Page('ACOP/Generating Functions'),
 Page('AOCP'),
 Page('AOCP/Generating Functions'),
 Page('Generating Functions'),
 Page('AOCP/Combinatorics'),
 Page('Cards'),
 Page('AOCP/Generating Permutations and Tuples'),
 Page('Letter Coverage'),
 Page('Five Letter Words')]

In [51]: type(page.getReferences())
Out[51]: itertools.islice

In [52]: type(page.backlinks())
Out[52]: itertools.chain

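The help output shows how the two methods differ: getReferences has withTemplateInclusion and onlyTemplateInclusion parameters, so it can also return pages that use this page as a template, whereas backlinks only returns pages that link to it: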

In [54]: help(page.getReferences)
Help on method getReferences in module pywikibot.page:

getReferences(follow_redirects=True, withTemplateInclusion=True, onlyTemplateInclusion=False, redirectsOnly=False, namespaces=None, total=None, content=False, step=NotImplemented) method of pywikibot.page.Page instance
    Return an iterator all pages that refer to or embed the page.

    If you need a full list of referring pages, use
    C{pages = list(s.getReferences())}

    @param follow_redirects: if True, also iterate pages that link to a
        redirect pointing to the page.
    @param withTemplateInclusion: if True, also iterate pages where self
        is used as a template.
    @param onlyTemplateInclusion: if True, only iterate pages where self
        is used as a template.
    @param redirectsOnly: if True, only iterate redirects to self.
    @param namespaces: only iterate pages in these namespaces
    @param total: iterate no more than this number of pages in total
    @param content: if True, retrieve the content of the current version
        of each referring page (default False)


In [55]: help(page.backlinks)
Help on method backlinks in module pywikibot.page:

backlinks(followRedirects=True, filterRedirects=None, namespaces=None, total=None, content=False, step=NotImplemented) method of pywikibot.page.Page instance
    Return an iterator for pages that link to this page.

    @param followRedirects: if True, also iterate pages that link to a
        redirect pointing to the page.
    @param filterRedirects: if True, only iterate redirects; if False,
        omit redirects; if None, do not filter
    @param namespaces: only iterate pages in these namespaces
    @param total: iterate no more than this number of pages in total
    @param content: if True, retrieve the content of the current version
        of each referring page (default False)
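
To see which pages the two methods disagree on, we can compare the titles they return. This is only a sketch; only_in_backlinks is a hypothetical helper name, and it calls both methods with their default arguments:

```python
def only_in_backlinks(page):
    """Titles returned by backlinks() but not by getReferences()."""
    back = {p.title() for p in page.backlinks()}
    refs = {p.title() for p in page.getReferences()}
    return sorted(back - refs)
```

Going by the two listings above, this should pick out 'Binomial Coefficients' for the example page.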

How To Edit A Page

Suppose you want to change a page's text. How do you go about doing that?

First, you can access a page's text using the text attribute:

>>> print(page.text)
==Stage 1: Collecting System Data==

===COMPLETED Phase 1a: Netdata===

First, we set up [[Netdata]].
...

To change the page, assign the new text to the text attribute and call save() with an edit summary:

>>> page.text = u"new page text"
>>> page.save(u"Log message for this edit")
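
In practice we usually want to change part of the text rather than overwrite all of it. A minimal sketch that does a string replacement and only saves when something actually changed; replace_text is a hypothetical helper name, and it relies only on page.text and page.save():

```python
def replace_text(page, old, new, summary):
    """Replace old with new in the page text; save only if the text changed.

    Returns True if the page was saved, False otherwise.
    """
    updated = page.text.replace(old, new)
    if updated == page.text:
        return False  # nothing to change, so skip the save
    page.text = updated
    page.save(summary=summary)
    return True
```

Skipping the save when nothing changed keeps the edit history free of no-op edits.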


All Available Methods

Page Object Methods

dir() lists every attribute and method available on a Page object, including private ones:

>>> dir(page)
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__unicode__',
 '__weakref__',
 '_applicable_protections',
 '_cache_attrs',
 '_cmpkey',
 '_contentmodel',
 '_cosmetic_changes_hook',
 '_getInternals',
 '_get_parsed_page',
 '_isredir',
 '_latest_cached_revision',
 '_link',
 '_namespace_obj',
 '_pageid',
 '_protection',
 '_revid',
 '_revisions',
 '_save',
 '_timestamp',
 'applicable_protections',
 'aslink',
 'autoFormat',
 'backlinks',
 'botMayEdit',
 'canBeEdited',
 'categories',
 'change_category',
 'clear_cache',
 'content_model',
 'contributingUsers',
 'contributors',
 'coordinates',
 'data_item',
 'data_repository',
 'defaultsort',
 'delete',
 'depth',
 'editTime',
 'embeddedin',
 'encoding',
 'exists',
 'expand_text',
 'extlinks',
 'fullVersionHistory',
 'full_url',
 'get',
 'getCategoryRedirectTarget',
 'getCreator',
 'getDeletedRevision',
 'getLatestEditors',
 'getMovedTarget',
 'getOldVersion',
 'getRedirectTarget',
 'getReferences',
 'getRestrictions',
 'getTemplates',
 'getVersionHistory',
 'getVersionHistoryTable',
 'image_repository',
 'imagelinks',
 'interwiki',
 'isAutoTitle',
 'isCategory',
 'isCategoryRedirect',
 'isDisambig',
 'isEmpty',
 'isFlowPage',
 'isImage',
 'isIpEdit',
 'isRedirectPage',
 'isStaticRedirect',
 'isTalkPage',
 'is_categorypage',
 'is_filepage',
 'is_flow_page',
 'iterlanglinks',
 'itertemplates',
 'langlinks',
 'lastNonBotUser',
 'latestRevision',
 'latest_revision',
 'latest_revision_id',
 'linkedPages',
 'loadDeletedRevisions',
 'markDeletedRevision',
 'merge_history',
 'move',
 'moved_target',
 'namespace',
 'oldest_revision',
 'pageAPInfo',
 'page_image',
 'pageid',
 'permalink',
 'preloadText',
 'previousRevision',
 'previous_revision_id',
 'properties',
 'protect',
 'protection',
 'purge',
 'put',
 'put_async',
 'raw_extracted_templates',
 'removeImage',
 'replaceImage',
 'revision_count',
 'revisions',
 'save',
 'section',
 'sectionFreeTitle',
 'set_redirect_target',
 'site',
 'templates',
 'templatesWithParams',
 'text',
 'title',
 'titleForFilename',
 'titleWithoutNamespace',
 'toggleTalkPage',
 'touch',
 'undelete',
 'urlname',
 'userName',
 'version',
 'watch']
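
The dir() output mixes private names, properties, and methods. A sketch for narrowing it down to public methods; public_methods is a hypothetical helper name. It looks names up on the class rather than the instance, on the assumption that evaluating a property on a live Page may trigger an API request:

```python
def public_methods(obj):
    """Sorted names of public callables defined on obj's class."""
    cls = type(obj)
    names = []
    for name in dir(cls):
        if name.startswith('_'):
            continue  # skip private and dunder names
        # Look the name up on the class, not the instance, so that
        # properties are returned as objects instead of being evaluated.
        if callable(getattr(cls, name, None)):
            names.append(name)
    return names
```

Usage would look like public_methods(page).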

Revision Object Methods

>>> revs = list(page.revisions())

>>> dir(revs[0])
['FullHistEntry',
 'HistEntry',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__unicode__',
 '__weakref__',
 '_content_model',
 '_parent_id',
 '_sha1',
 '_thank',
 'anon',
 'comment',
 'content_model',
 'full_hist_entry',
 'hist_entry',
 'minor',
 'parent_id',
 'revid',
 'rollbacktoken',
 'sha1',
 'text',
 'timestamp',
 'user']
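
A short sketch of how the revision attributes listed above (revid, timestamp, user, comment) might be used to summarize a page's history. The helper names edits_per_user and describe are hypothetical:

```python
from collections import Counter


def edits_per_user(revisions):
    """Count revisions per user name from an iterable of Revision objects."""
    return Counter(rev.user for rev in revisions)


def describe(rev):
    """One-line summary of a single revision."""
    return "{} {} {}: {}".format(rev.revid, rev.timestamp, rev.user, rev.comment)
```

Usage would look like edits_per_user(page.revisions()), or print(describe(revs[0])) with the revs list from above.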
