Admin Forum:Trouble with pywikipediabot

This Forum has been archived


I've been using my bot YodaBot on both the PathfinderWiki and Oscars Wiki for several years. Recently, however, I got a new computer and just set up the bot on the new machine. I've followed the instructions in this extremely helpful blog post and managed to get everything set up (I think). It lets me log in and a "login.py -test" command line confirms that. But when I try to edit, I get errors that I don't have a token and won't be able to edit pages. Can anyone offer any assistance in this? Without the ability to use the bot, categorizing images is going to make for some really, really tedious editing in the future. — yoda8myhead (talk) 08:21, December 12, 2012 (UTC)

Try setting it back up from scratch if you haven't already. Do you have the most up-to-date version of Python installed? (2.7.3) --Callofduty4 (talk) 09:27, December 12, 2012 (UTC)
Could you provide a few more details, please?
  • OS (and version number)
  • Python version number
  • Pywikipedia version number
  • Whether you have access to the known working pywikipedia folder on your old computer
  • Whether you're sure you're using the right family.py file for the wiki you're trying to edit
  • Whether you're absolutely sure that your account is bot-flagged on the wiki you're trying to edit
Having recently upgraded my own computer, I'd say that your initial approach should be to just copy your old, working pywikipedia folder over to your new computer, change its permissions (depending on OS) so that the current account can access the files within that folder, and there ya go.
But, as I said, I don't really know enough details about your situation to really diagnose. czechout@Wikia    fly tardis 19:16: Wed 12 Dec 2012
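The OS and Python details requested in the checklist above can be collected in one go. This is a minimal sketch using only the standard library; the function name environment_report is made up for illustration, and the pywikipedia version still has to be looked up separately.

```python
import platform
import sys


def environment_report():
    """Return the OS and Python details as a plain dict."""
    return {
        "os": platform.system(),          # operating system name
        "os_release": platform.release(),
        "python": platform.python_version(),
        "executable": sys.executable,      # which Python binary is running
    }


if __name__ == "__main__":
    # Single-argument print() works on both Python 2.7 and Python 3.
    for key, value in sorted(environment_report().items()):
        print("%s: %s" % (key, value))
```

Running it with the same python command used for the bot scripts shows which interpreter the bot is actually using, which matters when more than one Python is installed.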
Okay, I can see that you're a flagged bot on both oscars and pathfinder, but I can also see that it's been at least a year since you've made any edits with that account — on pathfinder, it's been almost three years. If you're trying to use the same package of files that you used back in the day, you'll probably have some difficulties.
The thing I'm thinking now is just that your family files aren't right. How 'bout posting them in their entirety, as well as user-config.py, and maybe I can spot the issue for you. czechout@Wikia    fly tardis 19:26: Wed 12 Dec 2012
Oh, another thing: it would be helpful to know exactly what you were trying to do when you got the error message. Could you cut and paste the command you were trying to give? Also, please put this into your command line. It'll harmlessly test whether things are generally working or not. Log into oscars and then type
python pagegenerators.py -cat:"2010 nominated films"
That should just generate a list of the movies in the category, and thereby prove that the error is in whatever command you issued, not your general setup. If that instance of pagegenerators produces an error of any kind, please post that error back here. czechout@Wikia    fly tardis 19:34: Wed 12 Dec 2012
Thanks. I'll try all those when I get home tonight. —yoda8myhead (talk) 19:38, December 12, 2012 (UTC)
Oh, I should also point out that it's entirely possible you've simply been unlucky. I've not been running my bot while I've been posting to you, so I'm not aware of any problems with the api. But there are times where you'll get that "invalid token" error purely because of some sort of behind-the-scenes connectivity issue at Wikia itself. There've been plenty of times I've been in a big bot run and had the thing start throwing me tons of those "invalid token" errors. They just keep coming until they eventually stop, and there's not a thing you can do about it. czechout@Wikia    fly tardis 19:44: Wed 12 Dec 2012
Ok, so I tested the command line you provided above and it worked perfectly. I'm still getting an error when entering the following, however:
python category.py add -file:nuke.txt
I'm on a Mac running OSX 10.8.2, Python 2.7, pywikipedia nightly build 2012-12-10
Let me know if you need to see my user-config and family files, and thanks for the help. —yoda8myhead (talk) 08:08, December 13, 2012 (UTC)
Okay, so you're using the Python that came pre-installed with Mountain Lion, it sounds like. That's what I'm using, too, so we can probably rule out Python, and there's unlikely to be an error in the category.py code if you're dealing with a build from this week. Not impossible, just unlikely. My next test would be to run
python category.py add -cat:"2010 nominated films"
That should bring up a prompt asking you what category you want to add. Choose a silly name like TestyTestyMcTest and enter that. The bot should then start to work, giving you the option of whether to put it on the first page in that category.
  • If it doesn't work, then we've got a bigger problem, and you may want to re-download just the category.py code, or I can just cut and paste a working version here, if you'd like.
  • If it does work, and my suspicion is that it will, then the problem is almost certainly in your nuke.txt file. I've just tested creating a file of that name and using it with category.py and had no issue. But here's the checklist of things to look at. Each page name must be enclosed in double square brackets. So, I dunno,
    [[Gandhi]][[Tootsie]][[Casino Royale]][[The Return of the King]][[Kung-Fu Panda]][[Oceans 13]]
    In my experience, it's best if it's UTF-8, absolutely-brain-dead-stupid plain text. So put your text into TextEdit and press SHIFT-CMND-T (or go under Format and pull down to "Make Plain Text"). Then, for simplicity's sake, save the text file directly into your pywikipedia folder. That should work, as I've just followed precisely those instructions myself.
The other thing I'd say is that files with category.py aren't really necessary. By the time you've created a file that you could use with the -file:filename nomenclature, you could have just run the category.py bot by choosing to approve each page manually. I'm obviously not sure exactly what you're trying to do, but if you can get the test I suggested above to work, consider running the bot like that. It's almost certainly a lot faster to go manual than to create a text file and then run from the text file. czechout@Wikia    fly tardis 15:06: Thu 13 Dec 2012
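If the list of pages already exists somewhere, the bracketed, UTF-8, plain-text file described above can also be generated rather than typed. A minimal sketch: the helper names format_titles and write_title_file are hypothetical, and the default output copies the single-line layout of the example above; whether the bot also accepts other separators is not confirmed in this thread.

```python
import io


def format_titles(titles, separator=u""):
    """Wrap each page title in [[...]] and join them into one string."""
    return separator.join(u"[[%s]]" % title for title in titles)


def write_title_file(titles, path, separator=u""):
    """Write the bracketed titles to `path` as UTF-8 plain text."""
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(format_titles(titles, separator))


if __name__ == "__main__":
    # Creates nuke.txt in the current folder, ready for -file:nuke.txt
    write_title_file([u"Gandhi", u"Tootsie", u"Casino Royale"], "nuke.txt")
```

Writing the file from code sidesteps both the rich-text problem and unbalanced brackets.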
Well, your test didn't work, and I tried copying just the category.py code into my directory and that didn't fix it. Thinking perhaps it was that script, I tried running a few others, and anything that's just scraping the wiki for lists or counts of something seems to work fine. But nothing is letting me edit pages, giving me a token error every time. I fear my family and/or user-config files are wrong. See the contents of both below:
import family
 
class Family(family.Family):
   def __init__(self):
       family.Family.__init__(self)
       self.name = 'oscars'
 
       self.langs = {
           'en': None,
           }
 
       # Translation used on all wikis for the different namespaces.
       # Most namespaces are inherited from family.Family.
       # Check the family.py file (in main directory) to see the standard
       # namespace translations for each known language.
       # You only need to enter translations that differ from the default.
       self.namespaces[4] = {
           '_default': u'oscars', # Specify the project namespace here. 
       }
       self.namespaces[5] = {
           '_default': u'oscars talk', # Specify the talk page of the project namespace here. 
       }
 
       # A few selected big languages for things that we do not want to loop over
       # all languages. This is only needed by the titletranslate.py module, so
       # if you carefully avoid the options, you could get away without these
       # for another wiki family.
       self.languages_by_size = ['en']
   def hostname(self,code):
       return 'oscars.wikia.com'
   def scriptpath(self, code):
       return ''
   def version(self, code):
       return "1.9" # Which version of MediaWiki is used?
mylang = 'en'
family = 'oscars'
usernames['oscars']['en'] = 'YodaBot'
console_encoding = 'utf-8'
use_api_login = True
sysopnames['oscars']['en']='YodaBot'
yoda8myhead (talk) 05:19, December 14, 2012 (UTC)
Ahhh. Yeah, try resetting your self.name to
self.name = 'oscars_wikia'
czechout@Wikia    fly tardis 05:56: Fri 14 Dec 2012
Oh, and self.langs should probably be
self.langs = { 'en':'oscars.wikia.com' }
czechout@Wikia    fly tardis 05:58: Fri 14 Dec 2012

And you'll probably need to update your namespaces.

        self.namespaces[4] = {
            '_default': u'Oscars',
        }
        self.namespaces[5] = {
            '_default': u'Oscars talk',
        }
        self.namespaces[110] = {
            '_default': u'Forum',
        }
        self.namespaces[111] = {
            '_default': u'Forum talk',
        }
        self.namespaces[1100] = {
            '_default': u'RelatedVideos',
        }
        self.namespaces[1200] = {
            '_default': u'Message Wall',
        }
        self.namespaces[1201] = {
            '_default': u'Thread',
        }
        self.namespaces[2000] = {
            '_default': u'Board',
        }
        self.namespaces[2001] = {
            '_default': u'Board Talk',
        }
        self.namespaces[2002] = {
            '_default': u'Topic',
        }

czechout@Wikia    fly tardis 06:08: Fri 14 Dec 2012

And this is how the last bit should be:
    def version(self, code):
        return "1.19.2"

    def scriptpath(self, code):
        return ''

czechout@Wikia    fly tardis 06:12: Fri 14 Dec 2012

Changes made, now getting:

Traceback (most recent call last):
  File "login.py", line 58, in <module>
    import re, os, query
  File "/Users/markmoreland/pywikipedia/query.py", line 28, in <module>
    import wikipedia, time
  File "/Users/markmoreland/pywikipedia/wikipedia.py", line 7889, in <module>
    getSite(noLogin=True)
  File "/Users/markmoreland/pywikipedia/wikipedia.py", line 7679, in getSite
    persistent_http=persistent_http)
  File "/Users/markmoreland/pywikipedia/wikipedia.py", line 4926, in __init__
    self.family = Family(fam, fatal = False)
  File "/Users/markmoreland/pywikipedia/wikipedia.py", line 4734, in Family
    myfamily = __import__('%s_family' % fam)
  File "/Users/markmoreland/pywikipedia/families/oscars_family.py", line 17
    self.namespaces[4] = {
    ^
IndentationError: unexpected indent

yoda8myhead (talk) 06:17, December 14, 2012 (UTC)
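An IndentationError like the one in the traceback above can be caught before running login.py at all. The original family file indents its method body by seven spaces, while the pasted snippets use eight, which is most likely what tripped the parser. Here is a minimal sketch using the built-in compile(); the function name check_python_source is made up for illustration.

```python
def check_python_source(source, filename="<family file>"):
    """Return None if `source` compiles, else a short error description."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as error:  # IndentationError is a subclass of SyntaxError
        return "%s at line %s: %s" % (
            type(error).__name__, error.lineno, error.msg)
    return None


if __name__ == "__main__":
    # Path is relative to the pywikipedia folder, as in the traceback.
    with open("families/oscars_family.py") as handle:
        problem = check_python_source(handle.read(), handle.name)
    print(problem or "family file compiles cleanly")
```

This only checks syntax; a file that compiles can still have the wrong family name or namespaces.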

Lol. I think I may have confused you by doing a piecemeal approach to the family file rewrite. Try completely replacing your family file with this text:

# -*- coding: utf-8  -*-
import family, config

class Family(family.Family):
    def __init__(self):
        family.Family.__init__(self)

        self.name = 'oscars_wikia'
        self.langs = { 'en':'oscars.wikia.com' }


        # Most namespaces are inherited from family.Family.
        self.namespaces[4] = {
            '_default': u'Oscars',
        }
        self.namespaces[5] = {
            '_default': u'Oscars talk',
        }
        self.namespaces[110] = {
            '_default': u'Forum',
        }
        self.namespaces[111] = {
            '_default': u'Forum talk',
        }
        self.namespaces[1100] = {
            '_default': u'RelatedVideos',
        }
        self.namespaces[1200] = {
            '_default': u'Message Wall',
        }
        self.namespaces[1201] = {
            '_default': u'Thread',
        }
        self.namespaces[2000] = {
            '_default': u'Board',
        }
        self.namespaces[2001] = {
            '_default': u'Board Talk',
        }
        self.namespaces[2002] = {
            '_default': u'Topic',
        }

        # A few selected big languages for things that we do not want to loop over
        # all languages. This is only needed by the titletranslate.py module, so
        # if you carefully avoid the options, you could get away without these
        # for another wikimedia family.

        self.languages_by_size = ['en','de']

    def version(self, code):
        return "1.19.2"

    def scriptpath(self, code):
        return ''

czechout@Wikia    fly tardis 06:20: Fri 14 Dec 2012

(If all we're getting is an indentation error regarding the namespace 4 definition, we're very, very close to gettin' this puppy workin'. czechout@Wikia    fly tardis 06:25: Fri 14 Dec 2012)
I now get
Traceback (most recent call last):
  File "login.py", line 58, in <module>
    import re, os, query
  File "/Users/markmoreland/pywikipedia/query.py", line 28, in <module>
    import wikipedia, time
  File "/Users/markmoreland/pywikipedia/wikipedia.py", line 142, in <module>
    from pywikibot import *
  File "/Users/markmoreland/pywikipedia/pywikibot/__init__.py", line 15, in <module>
    from exceptions import *
  File "/Users/markmoreland/pywikipedia/pywikibot/exceptions.py", line 13, in <module>
    import config
  File "/Users/markmoreland/pywikipedia/config.py", line 556, in <module>
    execfile(_filename)
  File "/Users/markmoreland/pywikipedia/user-config.py", line 3, in <module>
    usernames['oscars_wikia']['en'] = 'YodaBot'
KeyError: 'oscars_wikia'

yoda8myhead (talk) 06:44, December 14, 2012 (UTC)

Could you post your user-config.py, please? czechout@Wikia    fly tardis 07:06: Fri 14 Dec 2012
mylang = 'en'
family = 'oscars'
usernames['oscars_wikia']['en'] = 'YodaBot'
console_encoding = 'utf-8'
use_api_login = True
sysopnames['oscars_wikia']['en']='YodaBot'

yoda8myhead (talk) 07:13, December 14, 2012 (UTC)

Okay, lemme leave the user-config.py file to one side for the moment. I was able to reproduce your KeyError message, and to get a fix. Please go back into your families folder and look at the oscars_wikia_family file. My guess is that you don't have the extension .py after that name. Change the name to oscars_wikia_family.py, try relogging, and tell me what happens, please. czechout@Wikia    fly tardis 07:21: Fri 14 Dec 2012
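The file-name check suggested above can be done with a few lines of Python. This is a sketch under an assumption not confirmed in this thread: that this generation of pywikipedia only creates a usernames[...] entry for families whose file in the families folder is named <name>_family.py. The helper names known_families and missing_families are hypothetical.

```python
import os

FAMILY_SUFFIX = "_family.py"


def known_families(families_dir):
    """List family names that have a '<name>_family.py' file in families_dir."""
    names = []
    for filename in os.listdir(families_dir):
        if filename.endswith(FAMILY_SUFFIX):
            names.append(filename[:-len(FAMILY_SUFFIX)])
    return sorted(names)


def missing_families(families_dir, wanted):
    """Return the names in `wanted` that have no matching family file."""
    known = set(known_families(families_dir))
    return [name for name in wanted if name not in known]


if __name__ == "__main__":
    # Run from inside the pywikipedia folder.
    print(missing_families("families", ["oscars_wikia"]))
```

An empty list means the file name is fine; a name in the list means the family file is absent, misnamed, or lacks the .py extension.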
That seems to have fixed it. I'm getting some namespace errors but it's telling me how to fix those, so I think I'm ok. Thanks for all your help! —yoda8myhead (talk) 07:26, December 14, 2012 (UTC)

Great news! But I'm still a little concerned about your user-config.py. This is what mine looks like, adjusted for easy cut-and-pasting for you. I think it's useful to retain the notes, because I've gone back from time to time to tinker with various sections. Even if you don't personally like all the notes, make sure that you add socket_timeout = None. I'm pretty sure that's still necessary, even in Mountain Lion.


# -*- coding: utf-8  -*-

# This is an automatically generated file. You can find more configuration parameters in 'config.py' file.

# The family of sites we are working on. wikipedia.py will import
# families/xxx_family.py so if you want to change this variable,
# you need to write such a file.

family = 'oscars_wikia'

# The language code of the site we're working on.
mylang = 'en'


# The dictionary usernames should contain a username for each site where you
# have a bot account.
usernames['oscars_wikia']['en'] = u'YodaBot'
sysopnames['oscars_wikia']['en'] = u'YodaBot'

#Fix for Mac OS X
socket_timeout = None

############## LOGFILE SETTINGS ##############

# Defines for which scripts a logfile should be enabled. Logfiles will be
# saved in the 'logs' subdirectory.
# Example:
#     log = ['interwiki', 'weblinkchecker', 'table2wiki']
# It is also possible to enable logging for all scripts, using this line:
#     log = ['*']
# To disable all logging, use this:
#     log = []
# Per default, logging of interwiki.py is enabled because its logfiles can
# be used to generate so-called warnfiles.
# This setting can be overridden by the -log or -nolog command-line arguments.
log = ['interwiki']



############## INTERWIKI SETTINGS ##############

# Should interwiki.py report warnings for missing links between foreign
# languages?
interwiki_backlink = True

# Should interwiki.py display every new link it discovers?
interwiki_shownew = True

# Should interwiki.py output a graph PNG file on conflicts?
# You need pydot for this: http://dkbza.org/pydot.html
interwiki_graph = False

# Specifies that the robot should process that amount of subjects at a time,
# only starting to load new pages in the original language when the total
# falls below that number. Default is to process (at least) 100 subjects at
# once.
interwiki_min_subjects = 100

# If interwiki graphs are enabled, which format(s) should be used?
# Supported formats include png, jpg, ps, and svg. See:
# http://www.graphviz.org/doc/info/output.html
# If you want to also dump the dot files, you can use this in your
# user-config.py:
# interwiki_graph_formats = ['dot', 'png']
# If you need a PNG image with an HTML image map, use this:
# interwiki_graph_formats = ['png', 'cmap']
# If you only need SVG images, use:
# interwiki_graph_formats = ['svg']
interwiki_graph_formats = ['png']

# You can post the contents of your autonomous_problems.dat to the wiki,
# e.g. to http://de.wikipedia.org/wiki/Wikipedia:Interwiki-Konflikte .
# This allows others to assist you in resolving interwiki problems.
# To help these people, you can upload the interwiki graphs to your
# webspace somewhere. Set the base URL here, e.g.:
# 'http://www.example.org/~yourname/interwiki-graphs/'
interwiki_graph_url = None

# Save file with local articles without interwikis.
without_interwiki = False

# Experimental feature:
# Store the page contents on disk (/cache/ directory) instead of loading
# them in RAM.
interwiki_contents_on_disk = False

############## SOLVE_DISAMBIGUATION SETTINGS ############
#
# Set disambiguation_comment[FAMILY][LANG] to a non-empty string to override
# the default edit comment for the solve_disambiguation bot.
# Use %s to represent the name of the disambiguation page being treated.
# Example:
#
# disambiguation_comment['wikipedia']['en'] = \
#    "Robot-assisted disambiguation ([[WP:DPL|you can help!]]): %s"

sort_ignore_case = False


############## IMAGE RELATED SETTINGS ##############
# If you set this to True, images will be uploaded to Wikimedia
# Commons by default.
upload_to_commons = False


############## TABLE CONVERSION BOT SETTINGS ##############

# will split long paragraphs for better reading the source.
# only table2wiki.py use it by now
splitLongParagraphs = False
# sometimes HTML-tables are indented for better reading.
# That can do very ugly results.
deIndentTables = True
# table2wiki.py works quite stable, so you might switch to True
table2wikiAskOnlyWarnings = True
table2wikiSkipWarnings = False

############## WEBLINK CHECKER SETTINGS ##############

# How many external links should weblinkchecker.py check at the same time?
# If you have a fast connection, you might want to increase this number so
# that slow servers won't slow you down.
max_external_links = 50

report_dead_links_on_talk = False

############## DATABASE SETTINGS ##############
db_hostname = 'localhost'
db_username = 'wikiuser'
db_password = ''


############## SEARCH ENGINE SETTINGS ##############

# Some scripts allow querying Google via the Google Web API. To use this feature,
# you must install the pyGoogle module from http://pygoogle.sf.net/ and have a
# Google Web API license key. Note that Google doesn't give out license keys
# anymore.
# --------------------
# Google web API is obsoleted for long time, now we can use Google AJAX Search API,
# You can signup an API key from http://code.google.com/apis/ajaxsearch/signup.html.
google_key = ''


# using Google AJAX Search API, it require the refer website, this variable save the refer web address
# when you sign up the Key.
google_api_refer = ''

# Some scripts allow using the Yahoo! Search Web Services. To use this feature,
# you must install the pYsearch module from http://pysearch.sourceforge.net/
# and get a Yahoo AppID from http://developer.yahoo.com
yahoo_appid = ''

# To use Windows Live Search web service you must get an AppID from
# http://search.msn.com/developer
msn_appid = ''

# Using the Flickr api
flickr = {
    'api_key': None,  # Provide your key!
    'review': False,  # Do we use automatically make our uploads reviewed?
    'reviewer': None, # If so, under what reviewer name?
    }

# Proxy settings used for all connections.
# To use a proxy, proxy['host'] has to support HTTP and include the port number (e.g. localhost:8080).
# If the proxy server needs authentication, set proxy['auth'] to ('ID', 'PASSWORD').
proxy = {
    'host': None,
    'auth': None,
}


############## COPYRIGHT SETTINGS ##############

# Enable/disable search engine in copyright.py script
copyright_google = True
copyright_yahoo = True
copyright_msn = False

# Perform a deep check, loading URLs to search if 'Wikipedia' is present.
# This may be useful to improve number of correct results. If you haven't
# a fast connection, you might want to keep they disabled.
copyright_check_in_source_google = False
copyright_check_in_source_yahoo = False
copyright_check_in_source_msn = False

# Web pages may content a Wikipedia text without 'Wikipedia' word but with
# typical '[edit]' tag result of copy & paste procedure. You can want no
# report for this kind of URLs, even if they are copyright violation.
# However, when enabled these URLs are logged in a file.

copyright_check_in_source_section_names = False

# Limit number of queries for page.
copyright_max_query_for_page = 25

# Skip a specified number of queries
copyright_skip_query = 0

# Number of attempts on connection error.
copyright_connection_tries = 10

# Behavior if an exceeded error occur.
#
# Possibilities:
#
#    0 = None
#    1 = Disable search engine
#    2 = Sleep (default)
#    3 = Stop

copyright_exceeded_in_queries = 2
copyright_exceeded_in_queries_sleep_hours = 6

# Append last modified date of URL to script result
copyright_show_date = True

# Append length of URL to script result
copyright_show_length = True

# By default the script try to identify and skip text that contents a wide
# comma separated list or only numbers. But sometimes that might be the
# only part unmodified of a slightly edited and not otherwise reported
# copyright violation. You can disable this feature to try to increase
# number of results.
copyright_economize_query = True


############## FURTHER SETTINGS ##############

# The bot can make some additional changes to each page it edits, e.g. fix
# whitespace or positioning of interwiki and category links.

# This is an experimental feature; handle with care and consider re-checking
# each bot edit if enabling this!
cosmetic_changes = False

# If cosmetic changes are switched on, and you also have several accounts at
# projects where you're not familiar with the local conventions, you probably
# only want the bot to do cosmetic changes on your "home" wiki which you
# specified in config.mylang and config.family.
# If you want the bot to also do cosmetic changes when editing a page on a
# foreign wiki, set cosmetic_changes_mylang_only to False, but be careful!
cosmetic_changes_mylang_only = True
# The dictionary cosmetic_changes_enable should contain a tuple of languages
# for each site where you wish to enable cosmetic changes in addition to your
# own language (if cosmetic_changes_mylang_only is set).
# Please set your dictionary by adding such lines to your user-config.py:
# cosmetic_changes_enable['wikipedia'] = ('de', 'en', 'fr')
cosmetic_changes_enable = {}
# The dictionary cosmetic_changes_disable should contain a tuple of languages
# for each site where you wish to disable cosmetic changes. You may use it with
# cosmetic_changes_mylang_only is False, but you can also disable your own
# language. This also overrides the settings in the cosmetic_changes_enable
# dictionary. Please set your dict by adding such lines to your user-config.py:
# cosmetic_changes_disable['wikipedia'] = ('de', 'en', 'fr')
cosmetic_changes_disable = {}
# Use the experimental disk cache to prevent huge memory usage
use_diskcache = False

# Retry loading a page on failure (back off 1 minute, 2 minutes, 4 minutes
# up to 30 minutes)
retry_on_fail = True

# End of configuration section

By the way, if you're wondering how to switch back and forth between your two wikis, just add another family = 'wikiname_wikia' line underneath the oscars one, as well as additional lines under usernames and sysopnames and deactivate the ones you don't want by putting a pound sign in front of them.

Here's what my actual user-config.py file looks like right now:

#family = 'tardis_wikia'
family = 'oscars_wikia'
#family = 'factionparadox_wikia'
#family = 'pybot_wikia'
# The language code of the site we're working on.
mylang = 'en'


# The dictionary usernames should contain a username for each site where you
# have a bot account.
usernames['oscars_wikia']['en'] = u'CzechBot'
sysopnames['oscars_wikia']['en'] = u'CzechBot'
#usernames['tardis_wikia']['en'] = u'CzechBot'
#sysopnames['tardis_wikia']['en'] = u'CzechBot'
#usernames['factionparadox_wikia']['en'] = u'CzechBot'
#sysopnames['factionparadox_wikia']['en'] = u'CzechBot'
#usernames['pywikipediabot_wikia']['en'] = u'CzechBot' 
#sysopnames['pywikipediabot_wikia']['en'] = u'CzechBot'

There's probably a more elegant way to switch between accounts, but this method of manually adding/removing pound signs (hashtags) works well enough. czechout@Wikia    fly tardis 07:41: Fri 14 Dec 2012

Thanks again! —yoda8myhead (talk) 07:50, December 14, 2012 (UTC)
I've successfully added w:c:oscars:category:This is just a test to every page in w:c:oscars:category:Oscars by year. Everything seems to be working — although, as you say, you'll have to correct your family file for the currently-active namespaces. I'll leave you to use YodaBot to remove that category. czechout@Wikia    fly tardis 07:53: Fri 14 Dec 2012

Most if not all of the Python scripts accept a -family parameter (and a -lang parameter, btw; just run any script with -help) that selects, for that one run, a different family from the one set in user-config.py, as long as there is a usernames/sysopnames entry for that family-lang combination. This also means that you do not need to comment out usernames/sysopnames entries for alternative family-lang combinations. --PedroM
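The per-run override described above can be wrapped in a small helper so the target wiki is always spelled out explicitly. This is a sketch only: build_bot_command is a hypothetical name, and the -family:name and -lang:code spelling (with a colon) is an assumption to check against the script's -help output.

```python
def build_bot_command(script, args, family=None, lang=None, python="python"):
    """Build an argv list for a bot script, optionally overriding the wiki."""
    command = [python, script] + list(args)
    if family:
        command.append("-family:%s" % family)  # assumed flag syntax
    if lang:
        command.append("-lang:%s" % lang)      # assumed flag syntax
    return command


if __name__ == "__main__":
    command = build_bot_command(
        "category.py", ["add", "-cat:2010 nominated films"],
        family="oscars_wikia", lang="en")
    print(" ".join(command))
```

The resulting list can be handed to subprocess.call() from inside the pywikipedia folder; printing it first gives the same "which wiki am I on" reassurance as the commented-out family lines.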

Duh. Of course that's what ya do. Thanks for giving the memory a bit of a jog. I will say, however, that the advantage of commenting out families in my particular situation of managing several wikis is that I know which wiki I'm looking at when I'm looking at my Console windows. If I'm working with w:c:tardis it cannot be w:c:faction paradox on my Console screens. I think I evolved this way of working precisely because I needed a firewall between factionparadox and tardis, since they largely have the same categories and share quite a few page names. But other users would almost certainly find the -family trick more palatable than opening their user-config.py file every time they wanted to change wikis. czechout@Wikia    fly tardis 18:25: Fri 14 Dec 2012
