Community Central
Latest revision as of 18:31, 1 March 2013

This Forum has been archived
Forums: Admin Central Index → Technical Help → List of images by file size


I want to obtain a list of all images on my wiki by file size. I'm not talking about the table that is generated by Special:ListFiles, which has way too much extraneous information for my purposes. I just mean a "raw" list of the names of the files, ordered from biggest to smallest, kind of like how pagegenerators.py will give you a list of things. So, something like:

File1.jpg
File2.jpg
File3.jpg

I'm not seeing an obvious way to do it through DPL; as far as I can tell "file size" isn't a parameter on offer. Anyone got any suggestions? czechout    fly tardis 15:23: Fri 21 Sep 2012

From what I understand, there's no way to do that directly. Copying ListFiles into Excel is probably your best bet. You could also use the allimages API, but even then the files aren't listed by size. ʞooɔ
15:39, September 21, 2012 (UTC)
Of course! API to the rescue! Why didn't I think of that? The answer, though perhaps an inelegant one, turned out to be this query, plus some regex dust sprinkled on the results to extract the raw file names:
http://tardis.wikia.com/api.php?action=query&list=allimages&ailimit=1000&&aiminsize=1000000&aiprop=size
Now, this doesn't, in fact, sort by size, but rather alphabetically by file name. But at least it's a list of all files that are bigger than 1 MB.
Hey, since I'm not really that au fait with the API, I was wondering what you meant by importing into Excel? Do you have to specify a particular format that makes a better Excel import? Is it literally copying and pasting, or do you do a data import? Do you think you could take a moment to walk me through how you go from the page generated by the above API URL to a usable spreadsheet in Excel? Thanks :) czechout    fly tardis 14:42: Sat 22 Sep 2012
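Since the query above returns files alphabetically rather than by size, the ordering has to happen on the client. What follows is only a minimal sketch of that step, not the method anyone in this thread actually used: it assumes the API is asked for `format=json` (the thread's URL does not specify a format) and that each entry in the reply carries `name` and `size` keys. The function names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint quoted in the thread; swap in your own wiki's api.php.
API_URL = "http://tardis.wikia.com/api.php"


def build_query_url(min_size=1000000, limit=1000):
    """Build the allimages query from the thread, asking for JSON output."""
    params = {
        "action": "query",
        "list": "allimages",
        "ailimit": limit,
        "aiminsize": min_size,  # minimum size in bytes
        "aiprop": "size",
        "format": "json",       # assumption: not part of the original URL
    }
    return API_URL + "?" + urlencode(params)


def names_by_size(response):
    """Return file names from a parsed API reply, biggest file first."""
    images = response["query"]["allimages"]
    ranked = sorted(images, key=lambda img: img["size"], reverse=True)
    return [img["name"] for img in ranked]


def fetch_names_by_size(min_size=1000000):
    """Fetch one batch of results over the network and rank it by size."""
    with urlopen(build_query_url(min_size)) as reply:
        return names_by_size(json.load(reply))
```

Printing `"\n".join(fetch_names_by_size())` would give the raw one-name-per-line list asked for at the top of the thread. This handles a single batch only; a wiki with more matching files than `ailimit` would need the API's continuation mechanism, which is left out here.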
Rather than doing this in Excel, I think this would be a great thing to do on the wiki itself, so that other editors can potentially do it too. I've put together a quick example of what such a tool could look like at w:c:mathmagician:Allimages. Take a look at it and let me know if that looks like something that might work for you. (Note: it's not fully built yet, it's just a working demo.) Mathmagician ƒ(♫) 18:19 UTC, Saturday, 22 September 2012
That's really rather fabulous, MM. czechout    fly tardis 20:38: Sat 22 Sep 2012

(Reset indent) Cool :) -- let me know if there are any more features you want for the tool. I could just do a little more testing and basically give it to you "as-is", if all you really care about is the file size.

Or, I could take a day or two to build even more features into it if you think they'd be potentially helpful. Examples of features that could be added to this tool:

  • Ability to populate the table based on MIME type (e.g. I could add checkboxes to the interface so you can say "I want only videos" or "I want to search for only PNGs and GIFs")
  • Ability to sort by timestamp
  • Ability to sort by user who uploaded the file
  • Ability to not only set minimum file size, but also maximum file size.
  • Ability to look for images that begin with a certain prefix

The API can do all of these things, as I'm sure you saw when you were looking at it. It'd just be a matter of building a user interface to go along with the table that lets you conveniently set these sorts of options (i.e. making it user friendly), and then packaging the tool in a way that's easy to install or copy onto other wikis in case other people want to use it.

If you do still want to know how to do this in Excel, hopefully Cook can explain that. I don't have Excel on my main computer, unfortunately, and I'm not very good with spreadsheets :P. Mathmagician ƒ(♫) 21:12 UTC, Saturday, 22 September 2012
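For reference, the options listed in this post map fairly directly onto `allimages` parameters. The sketch below uses the parameter names from the current MediaWiki API documentation (`aimime`, `aiminsize`, `aimaxsize`, `aiprefix`, `aisort`); whether the wiki's MediaWiki version at the time of this thread supported all of them is an assumption. Sorting by uploader is not something the API offers as a sort key, so it is done client-side here. The function names are hypothetical and this is not the code of the tool being discussed.

```python
def allimages_params(mime_types=None, min_size=None, max_size=None,
                     prefix=None, sort="name"):
    """Assemble allimages query parameters for the requested filters."""
    if sort not in ("name", "timestamp"):
        raise ValueError("aisort accepts 'name' or 'timestamp'")
    if prefix and sort != "name":
        # Per the MediaWiki docs, aiprefix only works with aisort=name.
        raise ValueError("prefix lookup requires sort='name'")

    params = {
        "action": "query",
        "list": "allimages",
        "aiprop": "size|mime|timestamp|user",
        "aisort": sort,
        "format": "json",
    }
    if mime_types:
        # Multiple values are pipe-separated, e.g. image/png|image/gif.
        params["aimime"] = "|".join(mime_types)
    if min_size is not None:
        params["aiminsize"] = min_size  # bytes
    if max_size is not None:
        params["aimaxsize"] = max_size  # bytes
    if prefix:
        params["aiprefix"] = prefix
    return params


def sort_by_uploader(images):
    """Order result entries by uploader name, then by file name."""
    return sorted(images, key=lambda img: (img["user"].lower(), img["name"]))
```

For example, `allimages_params(mime_types=["image/png", "image/gif"], min_size=1000000)` describes a query for PNGs and GIFs of at least 1,000,000 bytes. Some wikis disable `aimime` for performance reasons, in which case filtering on the returned `mime` field client-side would be the fallback.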

Well, all I was really looking for was a way to output the raw file names, so that I could then add a category to the pages and quickly delete them. (Tardis has a "ye shall not upload images bigger than 1 MB" rule.) However, this tool would surely be helpful in monitoring compliance after the initial round of deletions is done, so I certainly have a use for this. I suppose I'm also interested in just kicking the tires on it, since the same basic method would likely work with other API queries.
All of which is a long way of saying that I'd like the following, please:
  • Population by MIME type
  • Sorting by timestamp
  • Prefix lookup
  • Sorting by uploader
I really am very excited by all this. Thanks for broadening my mind on the API possibilities. czechout    fly tardis 02:22: Sun 23 Sep 2012
Alright, this proved to be a bit more time-consuming than I originally thought, but I've finally put together a v1.0 of a form for the API queries that incorporates many of these ideas. The script is at w:c:dev:ListFiles; installation instructions and more can be found there! (The name is designed to remind you of Special:ListFiles, which is similar but less customizable.) Mathmagician ƒ(♫) 08:48 UTC, Tuesday, 25 September 2012
I'm sorry for taking over 48 hours to publicly thank you for this one, but I've been locked away doing a lot of little niggly things here and there. I freakin' love this thing. Maybe you are a genius after all ... :) czechout    fly tardis 18:44: Thu 27 Sep 2012