LT autocombines works by different authors

Talk: Bug Collectors

1 DuncanHill
Sep 6, 8:27am

I just added David Lloyd George by Cyril Parry. LT told me I had another copy - which I knew was wrong. It had combined it with the work of the same title (I don't seem to be able to use touchstones for different works of the same title in the same message, it's https://www.librarything.com/work/6402341) by Bryn Parry, Emyr Price, and Gareth Price.

As well as having different authors, the publishing details are also different.

This seems to me to be undesirable behaviour.

2 AnnieMod
Edited: Sep 6, 11:08am

This message has been deleted by its author.

3 timspalding
Edited: Sep 7, 12:11pm

Absent user decisions, the system makes guesses. Guesses involve trade-offs between false positives and false negatives. A book with the same title and the same last name of the principal author? Well, it made that guess. And most of the time it would be right. Sure, if the data were perfect, it could consider the whole name, but author names vary so much that they aren't more than one piece of data in a tricky balancing act. As for publication details, if it did that, there'd be no work system, because publication details almost ALWAYS differ!
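
To make the trade-off concrete, here is a minimal sketch of that kind of guess. It is not LibraryThing's actual code: the function names and the matching rule (normalized title plus the surname of the first-listed author) are assumptions made purely for illustration.

```python
import re


def surname(author: str) -> str:
    """Return the lowercased last word of a name as a crude surname guess."""
    words = re.findall(r"[^\W\d_]+", author.lower())
    return words[-1] if words else ""


def match_key(title: str, authors: list[str]) -> tuple[str, str]:
    """Hypothetical key: normalized title plus surname of the first-listed author."""
    norm_title = " ".join(re.findall(r"\w+", title.lower()))
    first_surname = surname(authors[0]) if authors else ""
    return norm_title, first_surname


def same_work_guess(a: tuple[str, list[str]], b: tuple[str, list[str]]) -> bool:
    """Guess that two records are the same work when their keys are equal."""
    return match_key(*a) == match_key(*b)


new_book = ("David Lloyd George", ["Cyril Parry"])
existing = ("David Lloyd George", ["Bryn Parry", "Emyr Price", "Gareth Price"])
print(same_work_guess(new_book, existing))  # True: a false positive in this case
```

A rule this loose tolerates "C. Parry" versus "Cyril Parry", which is the point of it, but it cannot tell Cyril from Bryn. Tightening it to compare full names would fix this case at the cost of missing many legitimate matches.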

4 DuncanHill
Sep 7, 4:11pm

Could the system be made to ask "Is this the same work?"

5 Foretopman
Edited: Sep 7, 4:22pm

>4 DuncanHill: Oh, I think users would tire of being asked very quickly. And if you try to develop an algorithm that only asks in doubtful situations, you are right back to where the system is making guesses that involve trade-offs.

6 MarthaJeanne
Sep 7, 5:04pm

In this case we are talking about only three copies altogether. My experience is that once there are more copies the autocombiner is a lot more accurate.

7 DuncanHill
Sep 8, 12:05am

"Absent user decisions, the system makes guesses" - the user decides to enter the correct authorname, the system guesses that it's wrong and should be ignored. "Cyril" and "Bryn" are very different, it's not like one was a "Cyril" and the other a "C".

"A book with the same title and the same last name of the principal author? Well, it made that guess. And most of the time it would be right" Really? Most of the time it's right to ignore that the author is different? And anyway Bryn Parry isn't the principal author of the other work, he's the one who comes first alphabetically. It's an LT design choice to treat the first-named author as principal.

It's unlikely that a user would spot this error if they did not already have the other work.

8 MarthaJeanne
Sep 8, 3:38am

Actually, I look at the work information every time I enter a book. I'm more likely to have to combine my copy than separate it.

You are also making the assumption that users make sure they are entering the author name correctly. Most users get the title right, but there are plenty who don't care about really bad author names or ISBNs.

9 rosalita
Sep 8, 6:06am

>8 MarthaJeanne: Indeed, there are plenty of book records that are just title, no author entered at all.

I cannot imagine not doing at least a basic edit of the book data after I add a book. I always add a Date Acquired and a From Where? so it's simple to also give the title/author fields a look at the same time.

10 timspalding
Edited: Sep 8, 9:52am

It's not that users enter bad author names, but that the data itself is tricky. If somebody's name is Sir Abraham Barton Cunningham, Jr. Bishop of Dorchester, there are probably 7^7 ways that data will come out. :)

11 aspirit
Sep 8, 12:41pm

Works with multiple authors are often entered with only one author, and members don't always choose the same one. I've noticed a tendency for the system to keep editions separate, though. I usually have to combine the editions with different author names myself.

That the works in this case auto-combined is interesting.

12 waltzmn
Sep 8, 1:00pm

>11 aspirit:

I can give an example of a different combination issue, which has bugged me for years, and which I tried desperately, and without success, to solve by splitting... well... everything I could.

To make this as clear as I can, the ISBN is 0806114169.

The title (not author, not really), as given on the title page, is The Canterbury Tales, Geoffrey Chaucer: A Facsimile and Transcription of the Hengwrt Manuscript, with Variants from the Ellesmere Manuscript. It is edited by Paul G. Ruggiers (not Chaucer, note!), with introductions by Donald C. Baker, A. I. Doyle, and M. B. Parkes.

LibraryThing insistently puts it in with The Canterbury Tales, and I cannot make it break the link.

This even though:
1. It is not by Chaucer, it's by Ruggiers. It contains Chaucer, but that's different. One does not file, say, George Adam Smith's commentary on the Book of Isaiah under the Bible, even though Smith gives a full translation.
2. It's a book of photographs. The photographs are transcribed, but people pay the (quite high) cost of the book for the photographs, not for the text. Besides, the text has a critical apparatus, so it's not exactly a continuous text anyway.
3. Even if you ignore points (1) and (2), the Hengwrt Manuscript, although it is the single most important manuscript of The Canterbury Tales, is not the Tales as we now know them; it omits several tales. (It appears to have been a hasty copy, probably made shortly after Chaucer's death, of such of his manuscript tales as could be found immediately at hand.)

I personally think it should be filed under Ruggiers, not Chaucer, but I'll admit that it has Chaucer's name on the spine. If LibraryThing files Chaucer as the primary author, I can live with it -- but don't file it with the Canterbury Tales! It's an independent work.

13 Maddz
Sep 8, 1:12pm

>12 waltzmn: Can you not add a work-to-work relationship so it won't auto-combine?

14 AnnieMod
Edited: Sep 8, 1:16pm

>12 waltzmn: It puts it there because the title before the : matches... and I suspect people put it there manually as well because based on the title and with no research it looks like it belongs. Once one copy is in, any new one will join it.

Separate it, add disambiguation note explaining exactly what you explained up in 3 (the different contents) and as much as you want from the rest and add relationship. That should keep it out.
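
The title match described at the top of this message can be sketched as follows. This is only a guess at the behaviour, not LibraryThing's real logic: `main_title` is a hypothetical helper, and the facsimile title string is an assumed shortened catalog form rather than the full title-page wording.

```python
def main_title(title: str) -> str:
    """Return the part of a title before the first colon, trimmed and lowercased."""
    return title.split(":", 1)[0].strip().lower()


# Assumed catalog form of the facsimile's title (hypothetical, for illustration).
facsimile = "The Canterbury Tales: A Facsimile and Transcription of the Hengwrt Manuscript"
plain = "The Canterbury Tales"

# Both titles reduce to the same main title, so a colon-based match would group them.
print(main_title(facsimile) == main_title(plain))
```

Dropping the subtitle is usually helpful, since editions of one work often differ only after the colon, which is why a case like this needs a manual separation and a note to stay apart.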

15 saltmanz
Sep 8, 1:44pm

>12 waltzmn: >14 AnnieMod: There are 6 separate editions of the "Hengwrt Manuscript" versions already combined with Canterbury Tales. They'll all need to be separated out into a single new work.

16 AnnieMod
Sep 8, 1:51pm

>15 saltmanz: Yep - saw that. I can do that if >12 waltzmn: does not want to. :)

17 waltzmn
Sep 8, 3:05pm

>16 AnnieMod:

">15 saltmanz: Yep - saw that. I can do that if >12 waltzmn: does not want to. :)"

I tried to separate them in the past, using methods that work for most other books, and it wouldn't stay split. So there is presumably something I don't understand about splitting. If someone can split them and make it stick, I can do work-to-work relationships and disambiguation and such. :-)

18 AnnieMod
Edited: Sep 8, 3:22pm

>17 waltzmn:

Being images and not text and people paying for it would not have made it a separate version. Having different contents does (plus Variorum editions are a lot more than just the works themselves; looking at one of my Variorum Shakespeares...). :)

Oh and here is the new work: https://www.librarything.com/work/27041631

Not sure about the relationship so this MAY need editing but that should stop someone from combining without clarifying with someone...

19 waltzmn
Sep 8, 5:03pm

>18 AnnieMod:

"Being images and not text and people paying for it would not have made it a separate version."

The point about price I grant; the point of that comment about price was just that this is not some $3.95 copy of The Canterbury Tales -- nor even a Folio Society reprint or the like. English literature students have to save for months to get a copy of this thing. :-) If they have it, they want to be able to demonstrate the fact. :-)

Litigating the question of images is beside the point in this Talk area, I suppose, but the fact that it's images of a manuscript that is not the autograph means that it's a different branch of the family tree of the work than any printed edition.

"Having different contents does (plus Variorum editions are a lot more than just the works themselves; looking at one of my Variorum Shakespeares...). :)"

:-) Depends on the variorum edition, FWIW. A variorum Beowulf is usually just another Beowulf. The Variorum Edition of the New Testament is really just a King James Bible with an exceptionally useful set of footnotes. The Shakespeare variorums (at least the ones I have) are more properly textual commentaries than variorums. But other variorums are truly different beasts.

"Oh and here is the new work: https://www.librarything.com/work/27041631

Not sure about the relationship so this MAY need editing but that should stop someone from combining without clarifying with someone..."

Thank you.

I think that's the best relationship available. There isn't a perfect one.

It ended up with an odd cover that I don't think belongs there, but it's too blurry to be sure. :-)

I've started adding other authors and common knowledge data and such to try to make all these things clear.

Thank you again. This really has been bugging me for years. :-)

20 bnielsen
Sep 9, 2:00am

>10 timspalding: LOL! All too true. And for a Russian writer you can multiply by the number of transliteration schemes used in GB, USA, France, Spain, Denmark, ...