Discussion:
Wikimedia Strategy
Andrea Zanni
2017-03-19 17:10:47 UTC
Dear all,
as you have probably heard, a process for writing the strategy of Wikimedia
has started in recent days.
It's a complex and collective process, and if you are confused, don't be:
everyone is ;-)

Conversations are starting to pop up everywhere on Meta, on Wikipedias, on
Wikisources, probably even on Facebook.

Here you can find a briefing, an initial overview of potential topics that
may come up across various strategy conversations. I suggest you give it a
look to understand the scope of this whole plan:
https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Process/Briefing

The question we are asked to answer is this:
***What do we want to build or achieve together over the next 15 years?***

I'd like you to go back in your community and join (or start) this
conversation,
but also share *here* some of your insights and opinions.
We'll polish these thoughts afterwards: this is the time to speak your
mind and dream big.

Aubrey
Asaf Bartov
2017-03-19 20:44:41 UTC
To my mind, the ~15-year focus invites us to think big (i.e. not a feature
here or there, but to imagine Wikimedia's role in the world in 2030, and in
our context here, what Wikisource might be within that role).

This, in turn, brings me back to a point I brought up in Vienna in 2015:
Wikisource's identity question, vis-a-vis other digital libraries. In
particular, assuming not just business-as-usual in coming years (i.e.
Project Gutenberg adding more books), but also obvious and long-awaited
developments like national libraries becoming more serious and more
effective in digitizing *and making accessible* their out-of-copyright
collections. In such a world, what might Wikisource's unique value be?

My own answer, in line with our Vienna answer to the identity question, is
that it is our human curation and meticulous attention to detail that sets
our project apart from other (better funded and larger-scale) digitization
efforts. We are able to create high quality, hyperlinked (and
semantically-linked, i.e. Wikidata) metadata to describe the texts we
produce.

If we accept this line of reasoning, what might be the significant role our
unique advantage might play in 15 years? What might we work towards to get
there? I don't have a clear vision, myself, but I have a strong
intuition/belief that it is to do with our curation and metadata
production, more than with our raw transcription production. This would
imply a fairly radical shift, in both labor and technological attention,
and I am not at all sure the Wikisource communities are interested or ready
to make such a change. I have sketched one example of the immense value
our volunteer communities might produce with our parallel and multilingual
volunteer labor in The Aboutness Project and the Table of Contents of
Everything project, documented here:
https://meta.wikimedia.org/wiki/Massively-Multiplayer_Online_Bibliography
(which I have alas not made progress on in the last year.)

I'd be very interested to hear other opinions about the future I painted
above, or other futures you see vis-a-vis Wikisource with a ~15-year
perspective.

Cheers,

A.
(volunteer hat)
_______________________________________________
Wikisource-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wikisource-l
Andrea Zanni
2017-03-19 23:30:11 UTC
I do happen to agree a lot with you, Asaf.

I do think of Wikisource mainly as a library, and then as a place where we
transcribe books.
It happens that books are paper-based, and that we want the text to be
available, searchable and readable. We want to create books and texts for
people to read and use.

"Books are for use" is the first law of Library Science, developed by
Ranganathan. [1]

What Wikisource does and can do is to make texts more accessible, linking
them with authors, other texts, maybe in the future even other Wikipedia
articles, or places on OpenStreetMap.
We can make the entire literature a place like Wikipedia: an interwoven,
intertwingled structure of texts and data and links.

It really strikes me that, historically, we have mainly two metaphors for
"the sum of human knowledge": the encyclopedia, and the library.

The encyclopedia is a single work, with a neutral point of view on "facts",
and we are trying to achieve that with Wikipedia.

The library is a much more complex "object", full of contradictory books
and views and interpretations.
What I'd love to see is a Wikimedia "universe" that goes beyond the
encyclopedic metaphor, and embraces the idea of a richer galaxy of
connected projects, which provide everything: NPOV articles, free books,
OERs, media, data, and maybe, in the future, other ways of representing
knowledge, and comments and opinions on knowledge.

We have yet to tap the idea of letting people comment, customize and
personalize our content for studying and learning, annotating, sharing and
creating educational material directly on our websites.
There will probably be time, but we must recognize we are just at the
beginning.

Aubrey

[1] https://en.wikipedia.org/wiki/Five_laws_of_library_science
Federico Leva (Nemo)
2017-03-20 00:01:44 UTC
Post by Asaf Bartov
In such a world, what might Wikisource's unique value be?
https://strategy.wikimedia.org/wiki/Proposal:Make_Wikisource_scale

Nemo
David Cuenca Tudela
2017-03-20 14:48:15 UTC
what might be the significant role our unique advantage might play in 15
years?
There are some circumstantial aspects that might be relevant for Wikisource:
- With the emergence of machine learning, do volunteers really need to
spend so much time formatting? Or will we be able to use our data to train a
system to do some pre-formatting for us?
- With the existing flood of data, can we consider ws as a relevancy
setter? If a document has been transcribed/imported into wikisource, is
that enough to make the document relevant?
- Considering that not all libraries might have the resources to develop
their own platform, can Wikisource be used as a neutral platform by
external agents as a complement to their own infrastructure?

Regarding the 15-year time frame, it might be a good exercise to examine
different scenarios. Yes, one could be to think big, to expect growth and a
favorable environment. But what about the opposite? What if there are
*fewer* people able to contribute?

Cheers,
Micru
Pine W
2017-03-20 19:14:52 UTC
Glad to see this discussion. Pinging Alex Stinson for this discussion in
case he has any insights to add from a GLAM perspective.

Pine
Andrea Zanni
2017-03-20 21:54:09 UTC
@Micru: of course, as you say, machine learning is the elephant in the room.
I dream of something we could call "Wikisource as a platform":
meaning an environment with structured data and workflows where you can
have APIs
and tools to interact with humans and machines, both for input and for
output.
We could have OCR software that learns from our human proofreaders, and
ideally we could
even have OCRs tailored to specific centuries or types of books.
We could use machine learning to look for citations within books (for
example other cited books or authors).[1]
This could greatly improve our library:
on Internet Archive or Google Books we have millions of books that just
wait for us to make them
readable and accessible, and, of course, connect them to Wikipedia, to
Wikidata, to other Wikisource books.
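
As a rough illustration of what "OCR that learns from proofreaders" could start from, here is a minimal sketch that compares a raw OCR line with its proofread version and counts the word-level corrections. It only uses Python's standard difflib module; the function name and the sample strings are hypothetical, and a real training pipeline would need far more than this.

```python
import difflib
from collections import Counter


def correction_pairs(ocr_text: str, proofread_text: str) -> Counter:
    """Count (ocr_token, corrected_token) pairs changed by the proofreader."""
    ocr_tokens = ocr_text.split()
    good_tokens = proofread_text.split()
    matcher = difflib.SequenceMatcher(a=ocr_tokens, b=good_tokens, autojunk=False)
    pairs = Counter()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep only one-to-one replacements; insertions, deletions and
        # uneven replacements are ignored in this sketch.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            for wrong, right in zip(ocr_tokens[i1:i2], good_tokens[j1:j2]):
                pairs[(wrong, right)] += 1
    return pairs


# Hypothetical sample: one OCR line and its proofread version.
ocr = "Tbe quick brown fox jumps ovcr the lazy dog"
proofread = "The quick brown fox jumps over the lazy dog"
pairs = correction_pairs(ocr, proofread)
for (wrong, right), count in pairs.items():
    print(f"{wrong} -> {right} ({count})")
```

Aggregated over many proofread pages, pairs like these could in principle feed a correction model, though whether that works well for a given century or type of book is an open question.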

IMHO, this is obviously important for GLAMs:
we could be much easier to use for libraries, archives and museums
that want to upload their texts and books into Wikisource, and make them
part of our hyperlinked library.
They could easily import into Wikisource, and export as well.
Right now, this is impossible or at least very, very difficult.[2]

I'm not sure that all these features could go in just one project, but it's
probably worth trying.

Aubrey

[1] I remember I explored the idea with Amir, but I couldn't follow up.
[2] To get all the data I needed from Wikisource books, I had to basically
scrape the website.
Andrea Zanni
2017-03-24 09:50:42 UTC
Anyone else?
It would be very good to know the gist of the discussions/opinions you are
having in your local Wikisource.

The Italian Wikisource for example is summing this up here:
https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Sources/Italian_Wikisource_Village_pump

For us, there is a bit of a disagreement about the idea and goal of being a
"library" versus being a "typography": being a library is more focused on
access and on services built upon texts (text analysis, text mining,
searching, hyperlinking, annotation), while being a typography is the
transcribing/proofreading part, which needs a whole different level of
tools and interface.

Maybe you are having a similar discussion?
Do you possibly see a "fork", in the future, of Wikisource in 2 different
projects, or at least 2 different interfaces?

Aubrey
Andrea Zanni
2017-03-28 06:36:56 UTC
Another thing I would be very happy to see in the future is a greater,
systematic collaboration with Internet Archive.
I'm convinced that it's a vital part of our ecosystem, because it makes it
easy to do a lot of things that would otherwise have to be done by skilled
users (like creating a PDF/DjVu, OCR, etc.).
When I explain Wikisource I always explain Internet Archive first,
teaching people to upload their files there, then into Commons/Wikisource
via the "IA Upload" tool.

This is why the Italian Wikisource community created a dedicated collection
on IA:
https://archive.org/details/itwikisource

To create a collection, you need at least 50 items, and then you can ask
Internet Archive to give you permission.
Right now, Alex Brollo is writing some scripts that will allow better
maintenance of the metadata;
we'll share them when they are ready.

If you create a collection, please tell us: we could even have a greater
"Wikisource" collection, that contains all the linguistic collections.

Maybe this is a bit OT for the strategy, but I think it suggests a way to
improve the collaboration between us and IA.
Federico Leva (Nemo)
2017-03-29 06:30:03 UTC
One issue sometimes raised about Wikisource is how we know that we're
working on the "right" books. Internet Archive is planning to digitize textbooks
starting from those which are most frequently assigned in USA schools:
http://blog.archive.org/2017/03/29/books-donated-for-macarthur-foundation-100change-challenge-from-bookmooch-users/

I was surprised to learn that a project like OpenSyllabus exists and works; I
emailed them to ask what it would take to do the same for other
languages/geographies.

Nemo
mathieu stumpf guntz
2017-04-11 07:42:54 UTC
Hi Nemo,

We may establish a list of the "1000 works that every Wikisource should
have" (with translation possibly needed).

What metric could we use to define such a list? Maybe reference
frequency, but it requires statistics whose availability is unknown to me.

Statistically,
psychoslave
Thomas PT
2017-04-11 08:44:00 UTC
A maybe simpler metric: the top 1000 Wikipedia articles about works, ranked by page views.

Thomas
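
The page-view metric suggested above could be sketched roughly as follows, assuming the view counts per work have already been collected somewhere. The function name is hypothetical and the sample figures are made up purely for illustration; they are not real statistics.

```python
import heapq


def top_works_by_pageviews(pageviews: dict[str, int], n: int = 1000) -> list[tuple[str, int]]:
    """Return the n works with the most page views, highest first."""
    return heapq.nlargest(n, pageviews.items(), key=lambda item: item[1])


# Made-up numbers, for illustration only.
sample = {
    "Faust": 52000,
    "Divina Commedia": 81000,
    "Max Havelaar": 9000,
    "Tirant lo Blanc": 4000,
}
top3 = top_works_by_pageviews(sample, n=3)
for title, views in top3:
    print(title, views)
```

Page views on one Wikipedia would favour works popular in that language, so a multilingual list would probably need the counts summed or normalised across several Wikipedias.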
Jane Darnell
2017-04-11 09:50:44 UTC
You can always start with the lists per country (if they exist). So for
example I made an article about the first 500 of such a "1000 most
important works of literature" list compiled for the Netherlands here:
https://en.wikipedia.org/wiki/Canon_of_Dutch_Literature
Andrea Zanni
2017-04-11 09:57:46 UTC
In it.source we made a similar Canon:
https://it.wikisource.org/wiki/Wikisource:Canone_delle_opere_della_letteratura_italiana

Ideally, we should have an item (a "work" item, so basically the one with a
Wikipedia article) on Wikidata for each one.
Then we can count how many Wikipedias have an article on it. Basically it's
Tpt's idea using Wikidata and sitelinks.

Aubrey
Magnus Manske
2017-04-11 10:23:30 UTC
The 500 most important (as in, number of Wiki sitelinks) literary works
that are (at least partially) in "original language" German, according to
Wikidata:
http://tinyurl.com/mzhd8na
"The Big Bang Theory" item might need some review, but the rest look good...
Just change the Q188 and the language code for your favourite language!
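
The exact query behind the short link is not reproduced in this thread, so the following is only a guess at its general shape: a small helper that builds a SPARQL query ranking literary works in a given language by their number of sitelinks. The choice of P31 (instance of), Q7725634 (literary work) and P407 (language of work or name) is an assumption; the original query may well have used other properties. The function name is hypothetical too.

```python
def sitelink_ranking_query(language_qid: str, label_lang: str, limit: int = 500) -> str:
    """Build a SPARQL query listing literary works in one language,
    ordered by how many wiki sitelinks each item has."""
    return f"""
SELECT ?work ?workLabel ?sitelinks WHERE {{
  ?work wdt:P31 wd:Q7725634 ;           # assumed: instance of "literary work"
        wdt:P407 wd:{language_qid} ;    # assumed: language of work or name
        wikibase:sitelinks ?sitelinks .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{label_lang},en". }}
}}
ORDER BY DESC(?sitelinks)
LIMIT {limit}
""".strip()


# Q188 is German; swap in another language item and label code as needed.
query = sitelink_ranking_query("Q188", "de")
print(query)
```

The resulting text can be pasted into the Wikidata Query Service; for Italian, for instance, one would pass "Q652" and "it".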
Carles Paredes Lanau
2017-04-11 10:27:55 UTC
Ca.source has a similar Canon list:

https://ca.wikisource.org/wiki/Viquitexts:Els_50_essencials_de_la_llengua_catalana

In my opinion, it's too hard to reach consensus on a multilingual list.
Post by Magnus Manske
The 500 most important (as in, number of Wiki sitelinks) literary works
that are (at least partially) in "original language" German, according to
http://tinyurl.com/mzhd8na
"The Big Bang Theory" item might need some review, but the rest look good...
Just change the Q188 and the language code for your favourite language!
Post by Andrea Zanni
https://it.wikisource.org/wiki/Wikisource:Canone_delle_opere_della_letteratura_italiana
Ideally, we should have an item (a "work" item, so basically the one with
a Wikipedia article) on Wikidata for each one.
Then we can count how many Wikipedias have an article on it. Basically
it's Tpt's idea using Wikidata and sitelinks.
Aubrey
You can always start with the lists per country (if they exist). So for
example I made an article about the first 500 of such a "1000 most
https://en.wikipedia.org/wiki/Canon_of_Dutch_Literature
Jane Darnell
2017-04-11 10:44:15 UTC
Permalink
Raw Message
Interesting query, thanks! How odd that "sitcom" is a subclass of "literary
work"! I never thought of it that way :)
Post by Magnus Manske
The 500 most important (as in, number of Wiki sitelinks) literary works
that are (at least partially) in "original language" German, according to
http://tinyurl.com/mzhd8na
"The Big Bang Theory" item might need some review, but the rest look good...
Just change the Q188 and the language code for your favourite language!
Gerard Meijssen
2017-04-11 10:51:56 UTC
Permalink
Raw Message
Hoi,
Classification as we have it is a wonder. It is there and it cannot be
explained. It does serve a purpose though.
Thanks,
GerardM
Post by Jane Darnell
Interesting query, thanks! How odd that "sitcom" is a subclass of
"literary work"! I never thought of it that way :)
Jane Darnell
2017-04-11 11:02:00 UTC
Permalink
Raw Message
Yes I agree - totally wonderful. And there are more ways to make a more
meaningful query out of this (in Dutch #1 is Barbapapa, and in English the
Simpsons take 1st place), by either specifying that it can't be a film or just
filtering for an inception date before 1970.
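
The two refinements suggested above could be sketched roughly as follows. This is not the query behind the shortened link (which is not reproduced in this thread); it is a minimal Python sketch that only builds a SPARQL string for the Wikidata Query Service. The function name `build_query` and its parameters are hypothetical. The choice of P407 as the language property is an assumption (the original query may well have used a different "original language" property), as is the use of P571 for the inception date. Q188 (German) comes from the thread; Q7725634 (literary work) and Q11424 (film) are the Wikidata classes assumed here.

```python
LITERARY_WORK = "Q7725634"  # Wikidata class "literary work" (assumed target class)
FILM = "Q11424"             # Wikidata class "film"


def build_query(language_qid="Q188", label_lang="de", limit=500,
                language_property="P407", exclude_classes=(),
                inception_before=None):
    """Return a SPARQL string ranking literary works by sitelink count.

    All parameters are illustrative; nothing is sent over the network.
    """
    lines = [
        "SELECT ?work ?workLabel ?sitelinks WHERE {",
        f"  ?work wdt:P31/wdt:P279* wd:{LITERARY_WORK} ;",
        f"        wdt:{language_property} wd:{language_qid} ;",
        "        wikibase:sitelinks ?sitelinks .",
    ]
    # Refinement 1: drop items that are also instances of an unwanted class.
    for qid in exclude_classes:
        lines.append(
            f"  FILTER NOT EXISTS {{ ?work wdt:P31/wdt:P279* wd:{qid} . }}"
        )
    # Refinement 2: keep only items with an inception date before a given year.
    if inception_before is not None:
        lines.append("  ?work wdt:P571 ?inception .")
        lines.append(f"  FILTER(YEAR(?inception) < {int(inception_before)})")
    lines.append(
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{label_lang},en" . }}'
    )
    lines.append("}")
    lines.append("ORDER BY DESC(?sitelinks)")
    lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_query(exclude_classes=(FILM,), inception_before=1970))
```

Excluding films alone would probably not remove television series such as sitcoms; other classes can be passed in `exclude_classes` in the same way. Note that the date filter also silently drops every work that has no inception statement at all.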
Post by Gerard Meijssen
Hoi,
Classification as we have it is a wonder. It is there and it cannot be
explained. It does serve a purpose though.
Thanks,
GerardM
ankry.wiki
2017-04-11 09:46:15 UTC
Permalink
Raw Message
I doubt we can find 1000 works with PD translations into each Wikisource language, including Latin and Sanskrit.
It would be hard to find 10, and those would be mostly ancient works.

Unlike Wikipedia, we present content that has already been created by somebody. We are not creating it ourselves
(except for the few Wikisources that accept user translations).

Ankry
Post by mathieu stumpf guntz
Hi Nemo,
We may establish a list of the "1000 works that every Wikisource should
have" (with translation possibly needed).
What metric could we use to define such a list? Maybe reference
frequency, but it requires statistics whose availability is unknown to me.
Statistically,
psychoslave
Post by Federico Leva (Nemo)
One issue sometimes raised about Wikisource is how we know that we're
working on the "right" books. Internet Archive is planning to
textbooks starting from those which are most frequently assigned in
http://blog.archive.org/2017/03/29/books-donated-for-macarthur-foundation-100change-challenge-from-bookmooch-users/
I was surprised to learn a project like OpenSyllabus exists and works,
I emailed them to ask what it would take to do the same for other
languages/geographies.
Nemo
David Starner
2017-04-11 11:17:03 UTC
Permalink
Raw Message
Post by ankry.wiki
I doubt we can find 1000 works with PD translations into each Wikisource
language, including Latin and Sanskrit.
It would be hard to find 10. Mostly ancient.
Unlike Wikipedia, we present content that has already been created by
somebody. We are not creating that ourselves.
(except few ws accepting Wikisource translations)
How many Wikisources don't accept user translations? I'd guess that at
least half of them do.

It may not be universal, but you'll never know how many of those works
actually have PD translations until you actually search for them. A list
can at least provoke the search.
Nicolas VIGNERON
2017-04-11 12:06:02 UTC
Permalink
Raw Message
Post by David Starner
Post by ankry.wiki
I doubt we can find 1000 works with PD translations into each Wikisource
language, including Latin and Sanskrit.
It would be hard to find 10. Mostly ancient.
Unlike Wikipedia, we present content that has already been created by
somebody. We are not creating that ourselves.
(except few ws accepting Wikisource translations)
How many Wikisources don't accept user translations? I'd guess that at
least half of them do.

Good question. We should clearly store this information somewhere (on
https://www.wikidata.org/wiki/Q19335648 and on local pages?).
Post by David Starner
It may not be universal, but you'll never know how many of those works
actually have PD translations until you actually search for them. A list
can at least provoke the search.

Exactly.
I can easily find 10 works in most languages of the planet (the Bible,
the Universal Declaration of Human Rights, Shakespeare, Conan Doyle,
Dickens, Stevenson, Verne, some important international treaties and
publications from the Vatican; that's already a lot more than 10 works
available in more than 100 languages).

Speaking of the UN, UNESCO created the Index Translationum (
http://www.unesco.org/xtrans/bsstatlist.aspx ), which can be helpful here.

Cdlt, ~nicolas

PS: Latin or Sanskrit are not the toughest challenges, try Breton or
Venetian :P (by the way, the UDHR exists in these 4 languages and 500 more
;) only the Bible has more translations).
Nicolas VIGNERON
2017-04-11 12:46:07 UTC
Permalink
Raw Message
Magnus's query was very good, but here is another one: the 100 authors with
the most pages on Wikisource.

http://tinyurl.com/mut5gzn

Caveat: there is still a lot of work to do on Wikidata; some pages are
probably missing (I can help there if needed), and the real numbers are
higher than this query shows.
(And sorry, the label service didn't work: it timed out.)

The top 10 is:
id         name             number
wd:Q16867  Edgar Allan Poe  28
wd:Q9061   Marx             27
wd:Q692    Shakespeare      25
wd:Q34787  Engels           24
wd:Q7243   Tolstoï          23
wd:Q22670  Schiller         21
wd:Q7200   Pushkin          21
wd:Q5879   Goethe           20
wd:Q6197   Horace           20
wd:Q501    Baudelaire       20
(Not exactly what I predicted earlier: the UN texts and the Bible have no
single author, so logically they don't show up here; I forgot the Russian
authors, mea magna culpa! And in the end, I'm surprised and sad not to see
Doyle, Dickens, Stevenson, Verne here... at least they are not far behind,
and they made it to the top 100 ;) )

Cdlt, ~nicolas
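
For anyone who wants to reproduce this kind of ranking offline, here is a minimal Python sketch. It assumes author items are already available in the shape of Wikidata entity JSON, where the `sitelinks` object is keyed by site id such as `enwikisource`, and that "pages on Wikisource" means the number of Wikisource sitelinks on the author item. The function names and the sample data are hypothetical and do not reproduce the real counts listed above.

```python
def wikisource_sitelinks(entity):
    """Return the sitelink keys of `entity` that point at a Wikisource.

    Language editions use keys like "frwikisource"; the multilingual
    wikisource.org uses the key "sourceswiki".
    """
    keys = entity.get("sitelinks", {})
    return sorted(k for k in keys if k.endswith("wikisource") or k == "sourceswiki")


def top_authors(entities, n=100):
    """Rank entity ids by number of Wikisource sitelinks, highest first."""
    counts = {qid: len(wikisource_sitelinks(e)) for qid, e in entities.items()}
    # Ties are broken by entity id so the order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


# Hypothetical sample data in the shape of Wikidata entity JSON (not real counts).
sample = {
    "Q692": {"sitelinks": {"enwiki": {}, "enwikisource": {},
                           "frwikisource": {}, "sourceswiki": {}}},
    "Q7243": {"sitelinks": {"ruwiki": {}, "ruwikisource": {}}},
    "Q501": {"sitelinks": {"frwiki": {}}},
}

if __name__ == "__main__":
    print(top_authors(sample, n=2))  # [('Q692', 3), ('Q7243', 1)]
```

As the caveat above says, such counts only reflect what has been linked on Wikidata, so the real numbers may be higher.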
ankry.wiki
2017-04-11 14:36:42 UTC
Permalink
Raw Message
Post by Nicolas VIGNERON
Post by ankry.wiki
I doubt we can find 1000 works with PD translations into each Wikisource
language, including Latin and Sanskrit.
It would be hard to find 10. Mostly ancient.
Unlike Wikipedia, we present content that has already been created by somebody.
We are not creating that ourselves.
(except few ws accepting Wikisource translations)
How many Wikisources don't accept user translations? I'd guess that at least
half of them do.
Good question. We should store clearly this information somewhere (on
https://www.wikidata.org/wiki/Q19335648 and local pages ?).
We do:
https://wikisource.org/wiki/Wikisource:Subdomain_coordination
At least 4 do not allow translations.
Post by Nicolas VIGNERON
It may not be universal, but you'll never know how many of those works
actually have PD translations until you actually search for them. A list can
at least provoke the search.
Exactly.
I can easily find 10 works in most languages of the planet (the Bible, the
Universal Declaration of Human Rights, Shakespeare, Conan Doyle, Dickens, Stevenson,
Verne, some important international treaties and publications from the Vatican;
that's already a lot more than 10 works available in more than 100 languages).
most != all (Most Wikisource should have... != All Wikisource should have...)
Post by Nicolas VIGNERON
Speaking of the UN, the UNESCO created the Index Translationum
( http://www.unesco.org/xtrans/bsstatlist.aspx ) that can be helpful here.
Cdlt, ~nicolas
PS: Latin or Sanskrit are not the toughest challenges, try Breton or Venetian
:P (by the way, the UDHR exists in these 4 languages and 500 more ;) only the
Bible has more translations).
I have intentionally chosen dead languages to point out that "all" should not
be the goal.

Concerning the UDHR, the copyright status is unclear even for the Polish translation:
it is not considered an official legal act and there is no "official" translation;
it was translated by a Foundation which says nothing about copyright. Moreover,
translations of foreign legal acts are considered copyrighted in Poland
(according to the opinions we have).

Translation copyright problems may exist for many translations of Conan Doyle,
Dickens, Stevenson or Verne.
I also doubt we will get a Wikisource translation of "The Posthumous Papers of the
Pickwick Club" into e.g. Lithuanian (ltwikisource seems to be
a single-user project, at least recently).

We can talk about 1000-100 "base" works in, maybe, 5-10 most active Wikisources.

Ankry
Nicolas VIGNERON
2017-04-11 14:59:42 UTC
Permalink
Raw Message
On 2017-04-11 14:06:02, user Nicolas VIGNERON <
Post by Nicolas VIGNERON
Post by David Starner
Post by ankry.wiki
I doubt we can find 1000 works with PD translations into each Wikisource
language, including Latin and Sanskrit.
It would be hard to find 10. Mostly ancient.
Unlike Wikipedia, we present content that has already been created by somebody.
We are not creating that ourselves.
(except few ws accepting Wikisource translations)
How many Wikisources don't accept user translations? I'd guess that at least
half of them do.
Good question. We should clearly store this information somewhere (on
https://www.wikidata.org/wiki/Q19335648 and on local pages?).
https://wikisource.org/wiki/Wikisource:Subdomain_coordination
At least 4 do not allow translations.
Post by Nicolas VIGNERON
Post by David Starner
It may not be universal, but you'll never know how many of those works
actually have PD translations until you actually search for them. A list can
at least provoke the search.
Exactly.
I can easily find 10 works in most languages of the planet (the Bible, the
Universal Declaration of Human Rights, Shakespeare, Conan Doyle, Dickens, Stevenson,
Verne, some important international treaties and publications from the Vatican;
that's already a lot more than 10 works available in more than 100 languages).
most != all (Most Wikisource should have... != All Wikisource should have...)
True, but should != must; for me, it's a suggestion here, not an obligation
(either way, nothing can really be made obligatory on a wiki ;) ).
Post by Nicolas VIGNERON
Speaking of the UN, UNESCO created the Index Translationum
( http://www.unesco.org/xtrans/bsstatlist.aspx ) that can be helpful here.
Cdlt, ~nicolas
PS: Latin or Sanskrit are not the toughest challenges, try Breton or Venetian
:P (by the way, the UDHR exists in these 4 languages and 500 more ;) only the
Bible has more translations).
I have intentionally chosen dead languages to point out that "all" should not
be the goal.
Latin and Sanskrit are not entirely dead and are much more active than most
languages of the planet (more than Breton or Venetian).
I'm not sure we have the same understanding of "goal"; for me it's a
direction, something we should tend toward, not an obligation that has
to be met.
it is not considered to be an official legal act, no "official" translation;
translated by a Foundation which say nothing about copyright. And even,
translations of foreign legal acts are considered copyrighted in Poland
(according to opinions we have).
Uh... strange... I thought UN documents were in the public domain (not all of
them, but clearly official documents like the UDHR, and that's why we have
https://commons.wikimedia.org/wiki/Template:PD-UN-doc ).
And http://www.ohchr.org/EN/AboutUs/Pages/Copyright.aspx seems quite
explicit to me.
Translation copyright problems may exist for many translations of Conan Doyle,
Dickens, Stevenson or Verne.
I also doubt we will get a Wikisource translation of "The Posthumous Papers of the
Pickwick Club" into eg. Lithuanian (while ltwikisource seems to be like
a single-user project - at least recently).
Sure, but this is clearly not the work I had in mind ;)
We can talk about 1000-100 "base" works in, maybe, 5-10 most active Wikisources.
Exactly! Let's go! Where can we store this? (beside Wikidata of course)

Cdlt, ~nicolas
mathieu stumpf guntz
2017-04-11 23:03:25 UTC
Permalink
Raw Message
Those are not goals for the end of a fiscal year, but a driving target, just
like having a list of articles every Wikipedia should have. :)
Andrea Zanni
2017-04-13 10:10:11 UTC
Permalink
Raw Message
I would like to bring the discussion back to the Wikimedia Strategy (of
course, you're free to fork this thread into several others: the more
discussions, the better ;-) ).

Last week I participated in the Wikimedia Conference,
which this year focused on Strategy.

We had several sessions in which 200 people from all over the movement
brainstormed and discussed freely about one single question: where do we
want to be in 2030?
Personally, I advocated and pushed for a more "holistic" approach: not just
an encyclopedia, but a platform for accessing and creating knowledge, in
whatever form.
There is something of a general consensus on that, but as the Wikisource
community I think it's *fundamental* to give our input, and push towards a
Wikimedia that is *beyond Wikipedia*.

Thus, I encourage you again to write your dream about Wikimedia in
2030 here: what would you like to see? Where would you like to be? At the
Wikisource conference, we spoke a lot about language equity, community,
tech. I'm sure you're full of ideas and vision.

There are *no wrong answers*, and we still have a few days to give our input
before the first stage of this long process ends.

Thanks!

On Wed, Apr 12, 2017 at 1:03 AM, mathieu stumpf guntz <
Post by mathieu stumpf guntz
That's not goals for the end of fiscal years, but driving target, just
like having a list of articles every Wikipedia should have. :)
Andrea Zanni
2017-04-18 19:59:04 UTC
Permalink
Raw Message
I tried to put some of the things we said on this page on Meta:
https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Sources/Wikisource_Mailing_list

Feel free to discuss them.
Basically, I summarised what Asaf, David and I said.

There will be another occasion for discussion, so feel free, again, to jump in
at any time.

Aubrey
Post by Andrea Zanni
I would like to bring back the discussion to the Wikimedia Strategy (of
course, you're free to fork this thread in several others: more
discussions, the better ;-)
Last week I participated in the Wikimedia Conference,
this year focused on Strategy.
We had several sessions in which 200 people from all over the movement
brainstormed and discussed freely about one single question: where do we
want to be, in 2030.
Personally, I advocated and pushed for a more "holistic" approach: not just
an encyclopedia, but a platform for accessing and creating knowledge, in
whatever form.
There is somewhat a general consensus on that, but as a Wikisource
community I think it's *fundamental* to give our input, and push towards a
Wikimedia that is *beyond Wikipedia*.
Thus, I encourage you again to write here your dream about Wikimedia in
2030: what would you like to see? where would you like to be? In the
Wikisource conference, we spoke a lot about language equity, community,
tech. I'm sure you're full of ideas and vision.
There are *no wrong answers*, and we still have few days to give our input
before the first stage of this long process ends.
Thanks!
On Wed, Apr 12, 2017 at 1:03 AM, mathieu stumpf guntz <
Post by mathieu stumpf guntz
Those are not goals for the end of a fiscal year, but a driving target, just
like having a list of articles every Wikipedia should have. :)
On 2017-04-11 14:06:02, user Nicolas VIGNERON <
Post by Nicolas VIGNERON
Post by ankry.wiki
I doubt we can find 1000 works with PD translations into each Wikisource
language, including Latin and Sanskrit.
It would be hard to find 10. Mostly ancient.
Unlike Wikipedia, we present content that has already been created by somebody.
We are not creating that ourselves.
(except for a few Wikisources that accept Wikisource translations)
How many Wikisources don't accept user translations? I'd guess that at least
half of them do.
Good question. We should clearly store this information somewhere (on
https://www.wikidata.org/wiki/Q19335648 and local pages?).
https://wikisource.org/wiki/Wikisource:Subdomain_coordination
At least 4 do not allow translations.
It may not be universal, but you'll never know how many of those works
Post by Nicolas VIGNERON
actually have PD translations until you actually search for them. A list can
at least provoke the search.
Exactly.
I can easily find 10 works in most languages of the planet (the Bible, the
Universal Declaration of Human Rights, Shakespeare, Conan Doyle, Dickens, Stevenson,
Verne, some important international treaties and publications from the Vatican;
that's already a lot more than 10 works available in more than 100 languages)
most != all (Most Wikisources should have... != All Wikisources should have...)
Speaking of the UN, UNESCO created the Index Translationum
Post by Nicolas VIGNERON
( http://www.unesco.org/xtrans/bsstatlist.aspx ) that can be helpful here.
Cdlt, ~nicolas
PS: Latin or Sanskrit are not the toughest challenges, try Breton or Venetian
:P (by the way, the UDHR exists in these 4 languages and 500 more ;) only the
Bible has more translations).
I have intentionally chosen dead languages to point out that "all" should not
be the goal.
it is not considered to be an official legal act, no "official" translation;
translated by a Foundation which says nothing about copyright. Moreover,
translations of foreign legal acts are considered copyrighted in Poland
(according to opinions we have).
Translation copyright problems may exist for many translations of Conan Doyle,
Dickens, Stevenson or Verne.
I also doubt we will get a Wikisource translation of "The Posthumous Papers of the
Pickwick Club" into e.g. Lithuanian (while ltwikisource seems to be like
a single-user project - at least recently).
We can talk about 1000-100 "base" works in, maybe, 5-10 most active Wikisources.
Ankry
Asaf Bartov
2017-04-18 20:42:31 UTC
Thank you for doing us this service, Andrea!

A.
ankry.wiki
2017-04-11 17:07:51 UTC
Post by ankry.wiki
Post by Nicolas VIGNERON
PS: Latin or Sanskrit are not the toughest challenges, try Breton or Venetian
:P (by the way, the UDHR exists in these 4 languages and 500 more ;) only the
Bible has more translations).
I have intentionally chosen dead languages to point out that "all" should not
be the goal.
Latin and Sanskrit are not entirely dead and are much more active than most languages
of the planet (more than Breton or Venetian).
I'm not sure we have the same understanding of « goal »; for me it's a direction,
something we should tend toward, not an obligation that has to be met.
I doubt Conan Doyle's or Verne's works in Latin will ever appear.
Post by ankry.wiki
it is not considered to be an official legal act, no "official" translation;
translated by a Foundation which says nothing about copyright. Moreover,
translations of foreign legal acts are considered copyrighted in Poland
(according to opinions we have).
Uh... strange... I thought UN documents were in the public domain (not all of them,
but clearly official documents like the UDHR, and that's why we have
https://commons.wikimedia.org/wiki/Template:PD-UN-doc ).
And http://www.ohchr.org/EN/AboutUs/Pages/Copyright.aspx seems quite explicit to me.
Official UN documents are likely PD. But Polish is not an official UN language,
so there is no Polish version of the UDHR as an *official* UN document.
The Polish translation of the UDHR that is published on UN web pages was
previously published in Poland, and then adopted by the UN. I found no reason
why it would not be copyrighted under Polish copyright law. And it is, of course,
less than 70 years old.

Even if it is PD in the US, it is not PD in Poland (likely it is a fair-use translation,
but the original translator/publisher is either unreachable or does not want to declare
its license). The status is simply unclear. "You can use it freely if not modified" is all we
have received.

I doubt there is any point in creating another Polish translation of this document.
Post by ankry.wiki
Translation copyright problems may exist for many translations of Conan Doyle,
Dickens, Stevenson or Verne.
I also doubt we will get a Wikisource translation of "The Posthumous Papers of the
Pickwick Club" into e.g. Lithuanian (while ltwikisource seems to be like
a single-user project - at least recently).
Sure, but this is clearly not the work I had in mind ;)
I am afraid this applies to any work by any author not yet published in Lithuanian.
Post by ankry.wiki
We can talk about 1000-100 "base" works in, maybe, 5-10 most active Wikisources.
Exactly! Let's go! Where can we store this? (besides Wikidata, of course)
Maybe somewhere in http://wikisource.org (sourceswiki)?

IMO, it is the best place for something applicable to multiple wikisource sites.
Post by ankry.wiki
Cdlt, ~nicolas
Ankry
Nicolas VIGNERON
2017-04-11 17:51:04 UTC
[snip]

I doubt Conan Doyle's or Verne's works in Latin will ever appear.
You of little faith, a 15-second search gave me:
http://ephemeris.alcuinus.net/holmesiaca.php (via
https://la.wikipedia.org/wiki/Arthurus_Conan_Doyle)
You often just have to search; as I said earlier, Latin is still a lively
language (and I've got 'Harrius Potter et Philosophi Lapis' in my library).
We shouldn't give up before we have even begun (and that's why this list sounds like a
good idea to me).

[snip and separate private mail about UN copyright]

Maybe somewhere in http://wikisource.org (sourceswiki)?
Post by ankry.wiki
IMO, it is the best place for something applicable to multiple wikisource sites.
Ok, I'll start there.
Starting on the Scriptorium:
https://wikisource.org/wiki/Wikisource:Scriptorium#List_of_most_translated_works
feel free to join.

Cdlt, ~nicolas