Commons:Batch uploading/Fortepan.HU

Example high resolution scanned photograph of mixed bathers in Opatija, Croatia, 1910.
  • Describe the works to be uploaded in detail (audio files, images by …):

43,671 photos by Hungarian authors, dated 1900 to 1990, showing mostly landscapes and portraits.

  • Which license tag(s) should be applied?

CC-BY-SA, as licensed (if trusted)

  • Is there a template that could be used on the file description pages? Do you think a special template should be created?

At least a hidden category for all unmodified photos, and well organized data in the |source= field of {{Information}}.

Some images already at Category:Images from Fortepan were uploaded to Commons as early as 2012; the most recent uploads are from last June. -- Tuválkin 01:22, 1 December 2015 (UTC)

Moreover, all photos seem to include the date taken, author name, location, a suitable short description (in Hungarian) and some sensible keywords/categories (in Hungarian and English); all this info should also be imported and integrated into Commons.

Filenames of unmodified photos should retain the number ID (maybe as a suffix at the end of a human readable filename in the form of (Fortepan.HU-123456) or similar), to enable quick checking. -- Tuválkin 02:07, 19 December 2014 (UTC)
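
The proposed naming scheme could be sketched as below. This is only an illustration of the suggestion above, not the code used for the upload: the function names `build_filename` and `extract_id`, the default `jpg` extension and the set of characters being cleaned are all assumptions.

```python
import re


def build_filename(title: str, fortepan_id: int, ext: str = "jpg") -> str:
    """Return a human-readable filename that keeps the Fortepan ID as a suffix."""
    # Replace characters that are not allowed in wiki page titles,
    # plus path separators, with a space (the exact set is an assumption).
    cleaned = re.sub(r'[#<>\[\]|{}:/\\]', " ", title)
    # Collapse runs of whitespace left behind by the replacement.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return f"{cleaned} (Fortepan.HU-{fortepan_id}).{ext}"


def extract_id(filename: str):
    """Recover the Fortepan ID from a filename, or None if there is no suffix."""
    match = re.search(r"\(Fortepan\.HU-(\d+)\)", filename)
    return int(match.group(1)) if match else None
```

Keeping the ID in a fixed, machine-readable suffix means a file can be traced back to its source record even after the human-readable part of the name has been edited.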

Deletion requests

'Fortepan' as author

These two DRs were raised on the basis that the author is given as 'Fortepan' rather than a named photographer. As both photographs are over 70 years old, they are likely to be in the public domain anyway. However, they raise the issue that photographs with no named donor may be deleted unless the process for validating releases on Fortepan is better explained.

The upload is now being filtered to skip files where the author is "Fortepan", unless the date is 1945 or earlier:
  • In these cases the license is set to {{PD-anon-70-EU}} + {{PD-anon-auto-1996}}, with the author as 'Unknown'.
  • If the date is 1922 or earlier, the simpler {{PD-anon-70-EU}} + {{PD-1923}} is used.

At the time of writing, 8,383 photographs are listed against "Fortepan" at the source; it is likely that only a minority will be dated 1945 or earlier. See PetScan listing of Public Domain Fortepan uploads.

If there is a named donor but the date is before 1916, i.e. the photograph is over 100 years old, then the default of {{CC-BY-SA-3.0}} is used together with {{PD-1923}}. This gives a good margin of error and recognizes that the named donor is logically not going to be the photographer. Matching search.
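
Taken together, the filtering rules in this section amount to a small decision function. The sketch below restates them in Python for clarity. It is not the upload script itself: the name `choose_license`, the return shape, and the handling of records with no year are assumptions.

```python
from typing import List, Optional, Tuple


def choose_license(author: str, year: Optional[int]) -> Optional[Tuple[str, List[str]]]:
    """Return (author_field, license_templates), or None if the file is skipped."""
    if author.strip().lower() == "fortepan":
        # No named donor: only sufficiently old photographs are uploaded.
        # Treating a missing year as "skip" is an assumption.
        if year is None or year > 1945:
            return None
        if year <= 1922:
            return ("Unknown", ["{{PD-anon-70-EU}}", "{{PD-1923}}"])
        return ("Unknown", ["{{PD-anon-70-EU}}", "{{PD-anon-auto-1996}}"])
    # Named donor: default license, plus {{PD-1923}} when dated before 1916.
    templates = ["{{CC-BY-SA-3.0}}"]
    if year is not None and year < 1916:
        templates.append("{{PD-1923}}")
    return (author, templates)
```

For example, `choose_license("Fortepan", 1950)` returns `None` (skipped), while a 1910 photograph with a named donor gets both `{{CC-BY-SA-3.0}}` and `{{PD-1923}}`.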

Brigitte Bardot glamour poster

Derivative work (DW), not de minimis: Commons:Deletion_requests/File:B.B._Fortepan_6511.jpg. -- Tuválkin 04:54, 1 August 2016 (UTC)

Photos from Magyar Rendőr

No authorization to license was given to or by Magyar Rendőr: Commons:Deletion requests/Photos from Magyar Rendőr.   — Jeff G. ツ 14:54, 6 June 2017 (UTC)

More

GLAM dashboard reports


Opinions


These files should all be deleted per User:Grin/fortepan.hu. It is clear that Fortepan is perpetrating copyfraud on a massive scale. There are currently 69,164 transclusions of Template:Fortepan, none of which have believable licenses.   — Jeff G. ツ 14:49, 6 June 2017 (UTC)

You are over-egging the case. Please avoid making dramatic statements that are not supported by the facts. My uploads use different licenses depending on the asserted facts, such as being anonymous, taken before 1923 or taken before 1946. I suggest you look more carefully at those different case types before arguing that everything must be deleted. Even if the uploader had no rights over an image, if there is no known photographer and it is over 70 years old, it is public domain. Thanks -- (talk) 15:01, 6 June 2017 (UTC)
Note that the dates 1923 or 1946 have no legal relevance since these photos have not been first published in the US. Instead, an image can be hosted on Commons if
  • a free license was granted to Fortepan (which does happen - their irresponsible copyright practices notwithstanding, they are a major archive in Hungary and do receive legitimate image donations)
  • the photographer has died before 1947 and the image has been published before 1992
  • the photographer is unknown and the image has been published before 1943
Sadly these tend to be hard to determine and the information provided by Fortepan is rarely detailed enough to be useful. --Tgr (talk) 14:25, 9 June 2017 (UTC)
Assigned to Progress Bot name Category
Started 27 May 2016
  • ~69,000 images, approx 560 already on Commons.
  • IDs go to 101938 at the moment, though there seems to be a backlog process, as not all of these have description pages yet (they give "No result!" rather than displaying the photo). It is likely that the backlog is processed by date-in, so Commons will only be able to take those with webpage records.
  • Downloading is straightforward, and parsing the HTML records will be a reasonably simple mapping. Tags (keywords) may or may not be useful for category mappings. I'll probably not do them automatically at upload.
  • Descriptions will be in Hungarian; there do not appear to be alternative languages. There is no true content difference between the en and hu webpage layouts.

Created Commons:Batch uploading/Fortepan.HU/Reports, populated by the GLAM dashboard.

Titles and descriptions may not exist, so a fallback scheme of title -> tag_place -> tags -> year is being followed.
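
The fallback chain can be sketched as follows. This is a minimal illustration, assuming each parsed record is a dictionary keyed by the field names used above. The function name `pick_title` and the final "Untitled" placeholder are hypothetical and not taken from the upload script.

```python
def pick_title(record: dict) -> str:
    """Return the first non-empty field in the order title, tag_place, tags, year."""
    for key in ("title", "tag_place", "tags", "year"):
        value = record.get(key)
        # A list of tags is joined into one comma-separated string.
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value if v)
        if value is not None and str(value).strip():
            return str(value).strip()
    # Hypothetical last resort when every field is missing or empty.
    return "Untitled"
```

A record such as `{"title": "", "tag_place": "Budapest"}` would therefore be named after its place tag, and a record with only a year would be named after the year.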

Upload rate is about 30 images per hour (~2 MB/min), barring glitches. The image rate will slow down proportionately for large file sizes.
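
As a rough check on what this rate implies, the arithmetic below uses the quoted figures (30 images per hour, ~2 MB/min) and the estimate of ~69,000 images. It assumes an uninterrupted run at a constant rate, which the real upload will not have been; the function names are illustrative only.

```python
IMAGES_PER_HOUR = 30
MB_PER_MINUTE = 2
TOTAL_IMAGES = 69_000


def average_mb_per_image() -> float:
    # 2 MB/min is 120 MB/hour, spread over 30 images.
    return MB_PER_MINUTE * 60 / IMAGES_PER_HOUR


def total_days() -> float:
    # Hours needed for the whole batch, converted to days of continuous running.
    return TOTAL_IMAGES / IMAGES_PER_HOUR / 24


print(average_mb_per_image())  # 4.0 MB per image on average
print(round(total_days(), 1))  # 95.8 days of continuous uploading
```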

Updates from 3 June:

  • The category Colour photographs from Fortepan is added automatically if tag:colorful is detected.
  • Attribution for CC-BY-SA explicitly set as "FOTO:FORTEPAN / <contributor>".
  • 4 photographs have been found with visible Swastikas on armbands or flags (search). For regional legal purposes the warning template {{swastika}} is being added automatically when tag:swastica is detected.
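
The tag-driven additions in this update could look roughly like the sketch below. It assumes the source tags arrive as a list of strings and that "Colour photographs from Fortepan" is a category name; `extras_for_record` and the shape of its return value are hypothetical.

```python
def extras_for_record(tags, contributor: str) -> dict:
    """Derive extra categories, templates and the attribution string for one record."""
    normalized = {t.strip().lower() for t in tags}
    categories = []
    templates = []
    if "colorful" in normalized:
        # Assumed to be a category; added when tag:colorful is present.
        categories.append("Colour photographs from Fortepan")
    if "swastica" in normalized:
        # Spelling follows the source tag; the template is {{swastika}}.
        templates.append("{{swastika}}")
    return {
        "categories": categories,
        "templates": templates,
        "attribution": f"FOTO:FORTEPAN / {contributor}",
    }
```

Keeping these rules in one place makes it easy to add further tag-triggered templates later without touching the rest of the upload logic.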

Update 18 June:

  • A corruption detector has been created for files that appear to be only partially downloaded. Problem files are added to Category:Images uploaded by Fæ (reload needed). These rare instances can be fixed by re-uploading, though it is possible that the file is corrupt at source.
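
How the detector works is not described here. One common heuristic for spotting truncated downloads, assuming the files are JPEGs, is to check that the data starts with the JPEG start-of-image marker (FF D8) and ends with the end-of-image marker (FF D9). The sketch below shows that heuristic only; `looks_truncated_jpeg` is a hypothetical name.

```python
def looks_truncated_jpeg(data: bytes) -> bool:
    """Heuristic: True if the bytes do not look like a complete JPEG file."""
    # A JPEG stream begins with the start-of-image marker FF D8.
    if not data.startswith(b"\xff\xd8"):
        return True
    # A complete stream ends with the end-of-image marker FF D9;
    # trailing zero padding is ignored before checking.
    return not data.rstrip(b"\x00").endswith(b"\xff\xd9")
```

This check is cheap but not conclusive: some valid files carry extra bytes after the end marker, and a file that passes may still be damaged in the middle, so flagged files are best reviewed or re-downloaded rather than rejected outright.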

Unfixed bugs:

  1. There is a pattern of an erratic double upload problem, apparently where there is a lag for the upload process to see that the upload succeeded. This is not damaging, but it does mean two (or more) copies are in the history and bandwidth is wasted by uploading the duplicate.
  2. A small number of files drop out after upload to the Wikimedia servers with "API error stashfailed". These may be caused by temporary issues or by errors in the uploaded file. As the numbers are small, re-upload is not attempted. A later re-run using the same code may be able to upload most of these files.
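
One possible mitigation for the double-upload problem, not something the upload script is stated to do, is to ask the MediaWiki API whether a file with the same SHA-1 hash already exists before retrying. The sketch below only builds the query parameters (using `list=allimages` with `aisha1`) and interprets a decoded JSON response. Sending the request is left to whichever HTTP client is in use, and the function names are hypothetical.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """SHA-1 of the file contents as a lowercase hex string."""
    return hashlib.sha1(data).hexdigest()


def duplicate_query_params(data: bytes) -> dict:
    """Parameters for an API query listing files with the same SHA-1."""
    return {
        "action": "query",
        "list": "allimages",
        "aisha1": sha1_hex(data),
        "format": "json",
    }


def already_uploaded(response: dict) -> bool:
    """True if the decoded API response lists at least one matching file."""
    return bool(response.get("query", {}).get("allimages"))
```

Running such a check before each retry would avoid re-sending a file whose first upload succeeded but was reported late, at the cost of one extra API request per retry.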
Completed September 2016.
N/A 66,320 R Images from Fortepan

Backlog for category checks: Catscan3 list