This Forum has been archived

Deleting Nonsense Pages Bot

I have a spare account called Head.Boy.Hog.Bot, and I thought it could come in handy on the Hogwarts RPG Wiki. I was wondering whether it could automatically delete newly created pages that don't have much on them (just a few words, say). Is that possible? If it is, how would I do that? Thanks! Head.Boy.HogTalk  00:48,8/24/2011

I'm not sure how a bot could identify a page as containing 'nonsense' content. And having a bot delete pages is a bit risky, IMO: if you don't supervise it, it may delete pages that have valid content. You could have it add the pages to a specific category, then review the pages and delete them manually. Hunter789 01:23, August 24, 2011 (UTC)
Okay. Head.Boy.HogTalk  00:38,8/25/2011 

Well, here's the thing. I was intrigued by this challenge. I don't know why, because honestly, the software already gives you a report of small pages at Special:ShortPages, so it's somewhat redundant to make a bot that does the same task. And, as Hunter789 already pointed out, you do want human oversight over deletions made for content reasons.

But for some reason I'm still intrigued by this. Maybe because if we solved this, we could have bots that automatically labelled pages of 0 length, and therefore flagged pages that had recently been blanked. I think there is an application for a bot that will count the number of words on a page, despite Special:ShortPages, and so I've spent some time on it.

I'm kind of okay with regex and solutions, but I'm not brilliant. So this doesn't work. But it might inspire someone else to figure out what I've done wrong. Here's my best shot at this so far. But again, it doesn't work. It runs, so it's not got syntax errors, but it doesn't catch anything. The basic theory here is that the regex is meant to find pages that have a "word boundary" count between 0 and 50, indicating an extremely short page. Then it's supposed to dump the contents of the page, plus a category, back onto the page.

fixes['length'] = {
    'regex': True,
    'msg': {
        'en': u'cat tagging exceptionally small articles',
    },
    'replacements': [
        (r'(^[\b(\w+?)\b]{0,50}$)', r'\1[[category:Short articles]]'),
    ],
}

Once that's in, it's then triggered by

python replace.py -fix:length -cat:whatever

So there ya go. Doesn't do what it's supposed to do, but I think it's really close. Again, maybe someone else will drop by and figure out why it's not quite working. czechout@fandom    fly tardis  22:49:43 Thu 15 Sep 2011
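
A possible explanation for why the pattern above catches nothing, offered as a sketch rather than a tested bot fix: square brackets make a character class, so `[\b(\w+?)\b]` does not mean "a word between boundaries". It matches one character that is a word character, a backspace, or a literal `(`, `)`, `+` or `?`. With `^` and `$` around it, and assuming the multiline flag is not switched on, the whole page would have to be at most 50 of those characters with no spaces at all. Counting words in plain Python may be the simpler route. The function names below (`count_words`, `tag_if_short`) are made up for this example and are not part of the bot framework, and the count is crude because wiki markup such as link targets gets counted as words too.

```python
import re

# The pattern from the post, unchanged, so its behaviour can be checked.
ORIGINAL = re.compile(r'(^[\b(\w+?)\b]{0,50}$)', re.UNICODE)

# Hypothetical helpers for this sketch; not part of any bot framework.
WORD = re.compile(r'\w+', re.UNICODE)
SHORT_TAG = '[[category:Short articles]]'


def count_words(text):
    """Count runs of word characters in the page text."""
    return len(WORD.findall(text))


def tag_if_short(text, max_words=50):
    """Append the category when the text has at most max_words words.

    Text that is already tagged, or that is longer, is returned unchanged.
    Blank pages (0 words) are tagged as well.
    """
    if SHORT_TAG in text or count_words(text) > max_words:
        return text
    return text.rstrip() + '\n' + SHORT_TAG


if __name__ == '__main__':
    stub = 'Harry is a wizard.'
    print(ORIGINAL.search(stub))   # shows whether the posted pattern matches
    print(count_words(stub))
    print(tag_if_short(stub))
```

This keeps Hunter789's suggestion intact: the bot only adds a category, and a person reviews that category before deleting anything.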