The most requested media files per referer and per media type.

wx_most_requested_files(
  referer,
  media_type = c("all", "image", "video", "audio", "document", "other"),
  granularity = c("daily", "monthly"),
  start_date = "20191231",
  end_date = "20200101"
)

Arguments

referer

The place that the request was made from. It can be any Wikimedia project (e.g. de.wikipedia), "all", "internal", "external", "search-engine", "unknown" or "none".

media_type

The media type that each file belongs to. It can be "image", "audio", "video", "document", or "other". The default is "all" media types. See the Media section below.

granularity

The time unit for the response data. Supported values are "daily" (default) and "monthly".

start_date

The date of the first day to include, in YYYYMMDD format. Can also be a Date or a POSIXt object, which will be auto-formatted.

end_date

The date of the last day to include, in YYYYMMDD format. Can also be a Date or a POSIXt object, which will be auto-formatted.
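
As a rough illustration of what the auto-formatting of start_date and end_date amounts to, a Date or POSIXt value can be turned into a YYYYMMDD string with base R's format(). The helper name as_yyyymmdd is hypothetical and not part of the package; how the package performs the conversion internally is not documented here, so treat this as a sketch only.

```r
# Hypothetical helper (not part of the package): convert a Date or POSIXt
# value to the YYYYMMDD string expected by start_date and end_date.
as_yyyymmdd <- function(x) {
  if (inherits(x, c("Date", "POSIXt"))) {
    return(format(x, "%Y%m%d"))
  }
  # Character input is assumed to already be in YYYYMMDD format.
  as.character(x)
}

as_yyyymmdd(as.Date("2019-12-31"))
as_yyyymmdd(as.POSIXct("2020-01-01 12:00:00", tz = "UTC"))
as_yyyymmdd("20200101")
```

In practice this means a call such as start_date = as.Date("2019-12-31") should be equivalent to start_date = "20191231".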

Value

A tibble data frame with the following columns:

date

Date; beginning of each month if granularity = "monthly"

referer

the referer that the requests were made from

file_path

the media file's path on upload.wikimedia.org

requests

number of requests

rank

requests-based ranking

Media

File types are obtained by parsing the file extension, and then classified according to the following table:

Extensions                                   Media type
svg, png, tiff, jpeg, gif, xcf, webp, bmp    image
mp3, ogg, oga, flac, wav, midi               audio
webm, ogv                                    video
pdf, djvu, srt, txt                          document
(all other extensions)                       other
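
The classification described above can be sketched in a few lines of R. This is only an illustration of the lookup, not the code that the API runs: the helper name classify_media, the lower-casing of extensions, and the example file paths are all assumptions made for this sketch.

```r
# Lookup of extension -> media type, transcribed from the table above.
media_types <- c(
  svg = "image", png = "image", tiff = "image", jpeg = "image",
  gif = "image", xcf = "image", webp = "image", bmp = "image",
  mp3 = "audio", ogg = "audio", oga = "audio", flac = "audio",
  wav = "audio", midi = "audio",
  webm = "video", ogv = "video",
  pdf = "document", djvu = "document", srt = "document", txt = "document"
)

# Hypothetical helper: classify file paths by their extension.
# Lower-casing the extension is an assumption of this sketch.
classify_media <- function(file_path) {
  ext <- tolower(tools::file_ext(file_path))
  type <- unname(media_types[ext])
  # Anything not found in the lookup (including no extension) is "other".
  type[is.na(type)] <- "other"
  type
}

# Made-up paths, for illustration only.
classify_media(c(
  "/wikipedia/commons/a/ab/Example.webm",
  "/wikipedia/commons/a/ab/Example.png",
  "/wikipedia/commons/a/ab/Example.xyz"
))
```

A named character vector keeps the lookup vectorised, so a whole file_path column can be classified in one call.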

Limitations

  • Splitting and filtering by referer is only available for data from May 2019 onward. Before that, referer is only split into internal, external, and unknown.

  • Mediarequest data begins on 1 January 2015.

  • Splitting and filtering by agent type (user, spider) is only available for data from May 2019 onward.

  • About 0.7% of mediarequests are prefetches coming from Media Viewer (more details in wikitech:Analytics/AQS/Media_metrics).
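
The date-related limitations above can be checked before making a request. The following is a minimal sketch: the helper name check_start_date and its return labels are hypothetical, and because the limitations only say "May 2019", using the 1st of May as the cutoff is an assumption.

```r
# First day with mediarequest data, per the limitations above.
first_mediarequest_day <- as.Date("2015-01-01")
# Referer and agent-type breakdowns are available from May 2019 onward;
# the exact day within May is not stated, so the 1st is an assumption.
first_breakdown_day <- as.Date("2019-05-01")

# Hypothetical helper: report which limitation, if any, applies to a
# start date given as a YYYYMMDD string.
check_start_date <- function(start_date) {
  d <- as.Date(start_date, format = "%Y%m%d")
  if (is.na(d)) stop("start_date must be in YYYYMMDD format")
  if (d < first_mediarequest_day) {
    "no data"
  } else if (d < first_breakdown_day) {
    "limited referer and agent breakdown"
  } else {
    "full breakdown"
  }
}

check_start_date("20141231")
check_start_date("20180101")
check_start_date("20191231")
```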

License

Data retrieved from the API endpoint is available under the CC0 1.0 license.

Examples

wx_most_requested_files("en.wikipedia", media_type = "video", granularity = "monthly")
#> # A tibble: 2,000 x 5
#>    date       referer    file_path                                requests  rank
#>    <date>     <chr>      <chr>                                       <int> <int>
#>  1 2019-12-01 en.wikipe… /wikipedia/commons/c/c2/December_18_199…   389645     1
#>  2 2019-12-01 en.wikipe… /wikipedia/commons/4/40/WASPS-Women-Air…   358472     2
#>  3 2019-12-01 en.wikipe… /wikipedia/commons/0/06/Response_to_the…   288922     3
#>  4 2019-12-01 en.wikipe… /wikipedia/commons/2/2b/Women_at_NASA-n…   244143     4
#>  5 2019-12-01 en.wikipe… /wikipedia/commons/c/c6/Greta_Thunberg-…   235248     5
#>  6 2019-12-01 en.wikipe… /wikipedia/commons/5/56/Secret_Santa_Vi…   229283     6
#>  7 2019-12-01 en.wikipe… /wikipedia/commons/b/b6/President_Trump…   214381     7
#>  8 2019-12-01 en.wikipe… /wikipedia/commons/1/1c/Wikipedia-Video…   210783     8
#>  9 2019-12-01 en.wikipe… /wikipedia/commons/3/3d/Epstein_Body_Mo…   200499     9
#> 10 2019-12-01 en.wikipe… /wikipedia/commons/1/14/President_Trump…   196129    10
#> # … with 1,990 more rows