Pywikibot/Pages: Difference between revisions
From charlesreid1
Revision as of 22:00, 31 January 2018
Page Objects
Page objects represent individual MediaWiki pages.
Many of the Site object's methods return Pages or PageGenerators (see Pywikibot/Sites).
Revision Objects
Each page has an edit history, consisting of different versions of the page at different points in time. Revision objects represent atomic information about a given revision of a given page.
Useful Actions
To build a graph of links on a given wiki, four methods are useful:
- page.backlinks - lists pages that link to the given page
- page.linkedPages - lists pages on this wiki that this page links to
- page.extlinks - lists external targets that this page links to
- page.getReferences - lists pages that refer to or embed the given page (links, redirects, and template inclusions)
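As a rough illustration of how these methods could feed a link graph, here is a minimal sketch of a breadth-first crawl. It assumes only that each page object offers title() and linkedPages(), as pywikibot Page objects do. The names build_link_graph, max_pages, and StubPage are hypothetical and exist only so the sketch runs without a wiki.

```python
from collections import deque


def build_link_graph(start_page, max_pages=50):
    """Breadth-first crawl mapping each page title to the titles it links to.

    Assumes page objects offer title() and linkedPages(), as pywikibot
    Page objects do. max_pages is a hypothetical safety cap.
    """
    graph = {}
    queue = deque([start_page])
    while queue and len(graph) < max_pages:
        page = queue.popleft()
        title = page.title()
        if title in graph:
            continue  # already visited
        targets = list(page.linkedPages())
        graph[title] = [target.title() for target in targets]
        queue.extend(targets)
    return graph


class StubPage:
    """Hypothetical stand-in for pywikibot.Page so the sketch runs offline."""

    links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

    def __init__(self, name):
        self.name = name

    def title(self):
        return self.name

    def linkedPages(self):
        return (StubPage(name) for name in self.links.get(self.name, []))


graph = build_link_graph(StubPage("A"))
print(graph)
```

On a real wiki the starting point would be an actual Page object instead of StubPage, and the cap should be chosen with the size of the wiki in mind, since every visited page costs at least one API request.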
backlinks
In [43]: page.backlinks
Out[43]: <bound method deprecated_args.<locals>.decorator.<locals>.wrapper of Page('AOCP/Binomial Coefficients')>
In [44]: page.backlinks()
Out[44]: <itertools.chain at 0x10838c828>
In [45]: list(page.backlinks())
Out[45]:
[Page('Flags'),
Page('Algorithms/Combinatorics and Heuristics'),
Page('Algorithms/Combinatorics'),
Page('AOCP/Multisets'),
Page('AOCP/Permutations'),
Page('Template:AOCPFlag'),
Page('AOCP/Multinomial Coefficients'),
Page('AOCP/Harmonic Numbers'),
Page('AOCP/Fibonacci Numbers'),
Page('ACOP/Generating Functions'),
Page('AOCP'),
Page('AOCP/Generating Functions'),
Page('Generating Functions'),
Page('AOCP/Combinatorics'),
Page('Cards'),
Page('Binomial Coefficients'),
Page('AOCP/Generating Permutations and Tuples'),
Page('Letter Coverage'),
Page('Five Letter Words')]
linkedPages
Calling linkedPages() returns all pages that the current page links to. The method returns a PageGenerator object, similar to the site's allpages() method. As before, we pass the generator to list() to consume it and build a list from the results.
In [46]: list(page.linkedPages())
Out[46]:
[Page('AOCP/Boolean Functions'),
Page('AOCP/Combinatorial Algorithms'),
Page('AOCP/Infinite Series'),
Page('Algorithm Analysis/Randomized Quick Sort'),
Page('Algorithm Analysis/Substring Pattern Matching'),
Page('ACOP/Generating Functions'),
Page('AOCP/Combinatorics'),
Page('AOCP/Fibonacci Numbers'),
Page('AOCP/Five Letter Words'),
Page('AOCP/Generating Permutations and Tuples'),
Page('AOCP/Harmonic Numbers'),
Page('AOCP/Multinomial Coefficients'),
Page('AOCP/Multisets'),
Page('Algorithm Analysis/Matrix Multiplication'),
Page('Algorithm Analysis/Merge Sort'),
Page('Algorithm complexity'),
Page('Algorithmic Analysis of Sort Functions'),
Page('Algorithms'),
Page('Algorithms/Combinatorics'),
Page('Algorithms/Combinatorics and Heuristics'),
Page('Algorithms/Data Structures'),
Page('Algorithms/Graphs'),
Page('Algorithms/Optimization'),
Page('Algorithms/Search'),
Page('Algorithms/Sort'),
Page('Algorithms/Strings'),
Page('Amortization'),
Page('Amortization/Accounting Method'),
Page('Binary Search'),
Page('Binary Search Modifications'),
Page('CS'),
Page('Cards'),
Page('Divide and Conquer'),
Page('Divide and Conquer/Master Theorem'),
Page('Estimation'),
Page('Estimation/BitsAndBytes'),
Page('Five Letter Words'),
Page('Flags'),
Page('Heap Sort'),
Page('Letter Coverage'),
Page('Merge Sort'),
Page('Project Euler'),
Page('Quick Sort'),
Page('Rubiks Cube/Permutations'),
Page('Rubiks Cube/Tuples'),
Page('Skiena Chapter 4 Questions'),
Page('Theta vs Big O'),
Page('Template:AOCPFlag'),
Page('Template:AlgorithmsFlag'),
Category('Category:AOCP')]
In [47]: type(page.linkedPages())
Out[47]: pywikibot.data.api.PageGenerator
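Because the result is a generator, it does not have to be turned into a full list. The sketch below shows one way to take only the first few items with itertools.islice. The generator fake_linked_pages is a stand-in for page.linkedPages() (an assumption made so the example runs offline), and its titles are copied from the output above.

```python
from itertools import islice


def first_n(generator, n):
    """Return at most n items from a generator without consuming the rest."""
    return list(islice(generator, n))


def fake_linked_pages():
    """Stand-in for page.linkedPages(); yields plain titles instead of Pages."""
    yield from [
        "AOCP/Boolean Functions",
        "AOCP/Combinatorial Algorithms",
        "AOCP/Infinite Series",
    ]


gen = fake_linked_pages()
head = first_n(gen, 2)   # the first two titles
rest = list(gen)         # whatever is left in the generator
again = list(gen)        # empty: a generator can only be consumed once
print(head, rest, again)
```

The last line is the point to remember: once a generator has been consumed, calling list() on it again gives an empty list, so call the method again if the results are needed twice.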
extlinks
Asking for the external links on a given page will return a plain generator:
In [48]: type(page.extlinks())
Out[48]: generator
In [49]: list(page.extlinks())
Out[49]:
['http://charlesreid1.com/w/index.php?title=Template:AOCPFlag&action=edit',
'http://charlesreid1.com/w/index.php?title=Template:AlgorithmsFlag&action=edit',
'https://charlesreid1.com:3000/cs/study-plan']
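Two of the three results above are edit links for templates on the wiki itself rather than genuinely external targets. A minimal sketch of filtering those out with the standard library follows. The helper is_edit_link is a hypothetical name, and treating action=edit as the marker of an edit link is an assumption based on the URLs shown.

```python
from urllib.parse import parse_qs, urlsplit

# The three URLs returned by list(page.extlinks()) above.
ext_links = [
    'http://charlesreid1.com/w/index.php?title=Template:AOCPFlag&action=edit',
    'http://charlesreid1.com/w/index.php?title=Template:AlgorithmsFlag&action=edit',
    'https://charlesreid1.com:3000/cs/study-plan',
]


def is_edit_link(url):
    """True if the URL's query string carries action=edit."""
    query = parse_qs(urlsplit(url).query)
    return query.get('action') == ['edit']


real_links = [url for url in ext_links if not is_edit_link(url)]
print(real_links)
```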
getReferences
It is not obvious from the output how getReferences differs from backlinks: the two results are almost identical, and only one item, Page('Binomial Coefficients'), appears in backlinks but not in getReferences.
In [50]: list(page.getReferences())
Out[50]:
[Page('Flags'),
Page('Algorithms/Combinatorics and Heuristics'),
Page('Algorithms/Combinatorics'),
Page('AOCP/Multisets'),
Page('AOCP/Permutations'),
Page('Template:AOCPFlag'),
Page('AOCP/Multinomial Coefficients'),
Page('AOCP/Harmonic Numbers'),
Page('AOCP/Fibonacci Numbers'),
Page('ACOP/Generating Functions'),
Page('AOCP'),
Page('AOCP/Generating Functions'),
Page('Generating Functions'),
Page('AOCP/Combinatorics'),
Page('Cards'),
Page('AOCP/Generating Permutations and Tuples'),
Page('Letter Coverage'),
Page('Five Letter Words')]
In [51]: type(page.getReferences())
Out[51]: itertools.islice
In [52]: type(page.backlinks())
Out[52]: itertools.chain
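The comparison between the two result lists can be done mechanically rather than by eye. The sketch below works on plain title strings copied from the two outputs above. The helper only_in_first is a hypothetical name, and with live Page objects one would presumably feed it the values of title() from each method.

```python
def only_in_first(first, second):
    """Titles present in `first` but missing from `second`, in original order."""
    missing_from = set(second)
    return [title for title in first if title not in missing_from]


# Titles copied from the list(page.backlinks()) output.
backlink_titles = [
    'Flags', 'Algorithms/Combinatorics and Heuristics',
    'Algorithms/Combinatorics', 'AOCP/Multisets', 'AOCP/Permutations',
    'Template:AOCPFlag', 'AOCP/Multinomial Coefficients',
    'AOCP/Harmonic Numbers', 'AOCP/Fibonacci Numbers',
    'ACOP/Generating Functions', 'AOCP', 'AOCP/Generating Functions',
    'Generating Functions', 'AOCP/Combinatorics', 'Cards',
    'Binomial Coefficients', 'AOCP/Generating Permutations and Tuples',
    'Letter Coverage', 'Five Letter Words',
]

# Titles copied from the list(page.getReferences()) output.
reference_titles = [
    'Flags', 'Algorithms/Combinatorics and Heuristics',
    'Algorithms/Combinatorics', 'AOCP/Multisets', 'AOCP/Permutations',
    'Template:AOCPFlag', 'AOCP/Multinomial Coefficients',
    'AOCP/Harmonic Numbers', 'AOCP/Fibonacci Numbers',
    'ACOP/Generating Functions', 'AOCP', 'AOCP/Generating Functions',
    'Generating Functions', 'AOCP/Combinatorics', 'Cards',
    'AOCP/Generating Permutations and Tuples',
    'Letter Coverage', 'Five Letter Words',
]

print(only_in_first(backlink_titles, reference_titles))
print(only_in_first(reference_titles, backlink_titles))
```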
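The help text shows the difference between the two methods: getReferences can also return pages that embed this page as a template (withTemplateInclusion, onlyTemplateInclusion), while backlinks returns only pages that link to it: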
In [54]: help(page.getReferences)
Help on method getReferences in module pywikibot.page:
getReferences(follow_redirects=True, withTemplateInclusion=True, onlyTemplateInclusion=False, redirectsOnly=False, namespaces=None, total=None, content=False, step=NotImplemented) method of pywikibot.page.Page instance
Return an iterator all pages that refer to or embed the page.
If you need a full list of referring pages, use
C{pages = list(s.getReferences())}
@param follow_redirects: if True, also iterate pages that link to a
redirect pointing to the page.
@param withTemplateInclusion: if True, also iterate pages where self
is used as a template.
@param onlyTemplateInclusion: if True, only iterate pages where self
is used as a template.
@param redirectsOnly: if True, only iterate redirects to self.
@param namespaces: only iterate pages in these namespaces
@param total: iterate no more than this number of pages in total
@param content: if True, retrieve the content of the current version
of each referring page (default False)
In [55]: help(page.backlinks)
Help on method backlinks in module pywikibot.page:
backlinks(followRedirects=True, filterRedirects=None, namespaces=None, total=None, content=False, step=NotImplemented) method of pywikibot.page.Page instance
Return an iterator for pages that link to this page.
@param followRedirects: if True, also iterate pages that link to a
redirect pointing to the page.
@param filterRedirects: if True, only iterate redirects; if False,
omit redirects; if None, do not filter
@param namespaces: only iterate pages in these namespaces
@param total: iterate no more than this number of pages in total
@param content: if True, retrieve the content of the current version
of each referring page (default False)
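Based on the parameters listed in the help text, one could separate plain references from template inclusions. The following is a sketch under that assumption. It uses the keyword names exactly as printed above, which may differ in other pywikibot versions. The function split_references and the StubPage class with its invented titles are hypothetical and only make the example runnable offline.

```python
def split_references(page):
    """Return (links, transclusions) for a page.

    Uses the getReferences keyword arguments shown in the help text above;
    other pywikibot versions may spell them differently.
    """
    transclusions = list(page.getReferences(onlyTemplateInclusion=True))
    links = list(page.getReferences(withTemplateInclusion=False))
    return links, transclusions


class StubPage:
    """Hypothetical stand-in mimicking the two keyword arguments used here."""

    def getReferences(self, withTemplateInclusion=True,
                      onlyTemplateInclusion=False):
        links = ['Linking page 1', 'Linking page 2']   # invented titles
        embeds = ['Embedding page 1']                  # invented title
        if onlyTemplateInclusion:
            return iter(embeds)
        if withTemplateInclusion:
            return iter(links + embeds)
        return iter(links)


links, transclusions = split_references(StubPage())
print(links)
print(transclusions)
```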
All Available Methods
Page Object Methods
A list of all available methods for Page objects:
>>> dir(page)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__',
'_applicable_protections', '_cache_attrs', '_cmpkey', '_contentmodel', '_cosmetic_changes_hook', '_getInternals', '_get_parsed_page', '_isredir', '_latest_cached_revision', '_link', '_namespace_obj', '_pageid', '_protection', '_revid', '_revisions', '_save', '_timestamp',
'applicable_protections', 'aslink', 'autoFormat', 'backlinks', 'botMayEdit', 'canBeEdited', 'categories', 'change_category', 'clear_cache', 'content_model', 'contributingUsers', 'contributors', 'coordinates', 'data_item', 'data_repository', 'defaultsort', 'delete', 'depth', 'editTime', 'embeddedin', 'encoding', 'exists', 'expand_text', 'extlinks', 'fullVersionHistory', 'full_url',
'get', 'getCategoryRedirectTarget', 'getCreator', 'getDeletedRevision', 'getLatestEditors', 'getMovedTarget', 'getOldVersion', 'getRedirectTarget', 'getReferences', 'getRestrictions', 'getTemplates', 'getVersionHistory', 'getVersionHistoryTable',
'image_repository', 'imagelinks', 'interwiki', 'isAutoTitle', 'isCategory', 'isCategoryRedirect', 'isDisambig', 'isEmpty', 'isFlowPage', 'isImage', 'isIpEdit', 'isRedirectPage', 'isStaticRedirect', 'isTalkPage', 'is_categorypage', 'is_filepage', 'is_flow_page', 'iterlanglinks', 'itertemplates',
'langlinks', 'lastNonBotUser', 'latestRevision', 'latest_revision', 'latest_revision_id', 'linkedPages', 'loadDeletedRevisions', 'markDeletedRevision', 'merge_history', 'move', 'moved_target', 'namespace', 'oldest_revision',
'pageAPInfo', 'page_image', 'pageid', 'permalink', 'preloadText', 'previousRevision', 'previous_revision_id', 'properties', 'protect', 'protection', 'purge', 'put', 'put_async',
'raw_extracted_templates', 'removeImage', 'replaceImage', 'revision_count', 'revisions', 'save', 'section', 'sectionFreeTitle', 'set_redirect_target', 'site', 'templates', 'templatesWithParams', 'text', 'title', 'titleForFilename', 'titleWithoutNamespace', 'toggleTalkPage', 'touch', 'undelete', 'urlname', 'userName', 'version', 'watch']
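Much of the dir() output consists of dunder and underscore-prefixed names. A small sketch of trimming it to the public names is shown here. The helper public_names and the Demo class are hypothetical; Demo merely stands in for a Page object so the example runs without pywikibot.

```python
def public_names(obj):
    """Names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]


class Demo:
    """Hypothetical stand-in for a Page object."""

    def __init__(self):
        self._pageid = 1  # private attribute, filtered out below

    def title(self):
        return 'Demo'

    def exists(self):
        return True


names = public_names(Demo())
print(names)
```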
Revision Object Methods
>>> revs = list(page.revisions())
>>> dir(revs[0])
['FullHistEntry', 'HistEntry', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', '__weakref__',
'_content_model', '_parent_id', '_sha1', '_thank', 'anon', 'comment', 'content_model', 'full_hist_entry', 'hist_entry', 'minor', 'parent_id', 'revid', 'rollbacktoken', 'sha1', 'text', 'timestamp', 'user']
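The user attribute listed above suggests one simple use of revision objects: counting edits per user. The sketch below assumes only that each revision exposes a user attribute. StubRevision and its sample values are invented stand-ins, not real revision data.

```python
from collections import Counter, namedtuple


def edits_per_user(revisions):
    """Count how many revisions each user contributed."""
    return Counter(rev.user for rev in revisions)


# Hypothetical stand-in for pywikibot revision objects; values are invented.
StubRevision = namedtuple('StubRevision', ['revid', 'user', 'minor'])

revs = [
    StubRevision(3, 'alice', False),
    StubRevision(2, 'bob', True),
    StubRevision(1, 'alice', False),
]

counts = edits_per_user(revs)
print(counts.most_common())
```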