Wikipedia talk:Persondata


This is an old revision of this page, as edited by Rajah (talk | contribs) at 06:20, 16 October 2007 (→‎Added instructions for extraction from database dump). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

WikiProject Biography (Project-class)
This page is within the scope of WikiProject Biography, a collaborative effort to create, develop and organize Wikipedia's articles about people. All interested editors are invited to join the project and contribute to the discussion. For instructions on how to use this banner, please refer to the documentation.
This page does not require a rating on Wikipedia's content assessment scale.
Template:Persondata has been protected indefinitely. Use {{editprotected}} on this page to request an edit.

--Rajah 06:19, 16 October 2007 (UTC)[reply]

Use on pages listing multiple people...

I'm looking at Delirious?_musicians and wondering if PERSONDATA can appear multiple times on the same page and not cause problems. Any thoughts? Dan, the CowMan 03:17, 10 April 2007 (UTC)[reply]

For now, I would stick to adding persondata to people with articles about themselves. --Rajah 05:24, 2 May 2007 (UTC)[reply]
You can, though, use {{Hcard-bday}} to generate in-line hCard microformats for each person. Andy Mabbett 08:58, 16 June 2007 (UTC)[reply]

hCard microformats in infoboxes

Further to earlier discussions, a number of biography-related infoboxes now produce an hCard microformat. Please feel free to add the necessary mark-up to more. (Chiefly, that's class="vcard" on the whole infobox and class="fn" on the pagename or name field.) Note that the date of birth is only included if {{Birth date}} or {{Birth date and age}} is used. Andy Mabbett 17:14, 19 April 2007 (UTC)[reply]

This is total madness. How did someone think it was a good idea to have redundant metadata? The hcard functionality should have been implemented in Persondata, not in 500 different infoboxes. What a mess. Kaldari 22:44, 27 August 2007 (UTC)[reply]
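
Editorial sketch: the hCard mark-up described in this thread boils down to a "vcard" class on the container and an "fn" class on the name. The following Python is only an illustration of that output shape; the function name hcard and the use of div/span elements are assumptions, not what any infobox template actually emits.

```python
from html import escape

def hcard(name, bday=None):
    """Wrap a name (and an optional ISO birth date) in minimal hCard mark-up.

    Hypothetical helper for illustration; real infoboxes add these classes
    to their existing table mark-up instead of emitting new elements.
    """
    parts = [f'<span class="fn">{escape(name)}</span>']
    if bday:
        # The "bday" property carries the date of birth in YYYY-MM-DD form.
        parts.append(f'<span class="bday">{escape(bday)}</span>')
    return f'<div class="vcard">{" ".join(parts)}</div>'

print(hcard("Alexander Grothendieck", "1928-03-28"))
```

The classes are invisible to readers and editors; only tools that parse the rendered HTML see them.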

NAME attributes

A couple of questions:

- Are nicknames acceptable within the ALTERNATIVE NAMES attribute?

- Could it be further clarified as to what should populate NAME and ALTERNATIVE NAMES? For example, for Tony Blair, his full birth name is in ALTERNATIVE NAMES, and his familiar name in NAME, but for Steven Gerrard it is the other way around.

Thanks, --Jameboy 16:16, 20 April 2007 (UTC)[reply]

The Tony Blair example is how it is supposed to work. For Steven Gerrard, his full name in the name field is enough. Having his name sans middle name in the Alternative field doesn't add any information. (If anything you would put Gerrard, Steven in the name field and Gerrard, Steven Middlename in the Alternative field.) --Rajah 20:32, 20 April 2007 (UTC)[reply]
Thanks. What about nicknames though? --Jameboy 14:28, 22 April 2007 (UTC)[reply]
I would say it would depend on the nickname and how uniquely identifying it is. e.g. "Honest Abe" shouldn't be in Abraham Lincoln, but Splendid Splinter could be in Ted Williams. For nobility, nicknames are sometimes the first name in the persondata, e.g. Catherine II of Russia. Generally, if a nickname universally and uniquely identifies someone, I think it should be listed in the Alternative Names section; if it fails to meet those criteria it should be omitted. Do you have a specific example? --Rajah 20:33, 28 April 2007 (UTC)[reply]

Colors

Is light gray really a good color to put on a white background? Maybe it could be darker and in bold. ~ EdBoy[c] 03:15, 12 May 2007 (UTC)[reply]

I guess most editors who use Persondata are familiar with the fields. Since the data is only a set of meta-data without any relevance for the article at all, it's not that bad to see it in a decent colour scheme. IIRC the colours are defined by CSS, so you should be able to define your own CSS rules (dark, bold, blinking, CAPS, ...). --32X 20:23, 15 May 2007 (UTC)[reply]

Hispanic Surnames

Could the instructions please specify that the surname generally used in Spanish-language names is the first surname where two are given? For example, I have just become aware of this template because of one of the pages I have on my watchlist, Ecuadorian footballer Ulises de la Cruz. His mother's surname is Bernal, and so the full, formal version of his name is Ulises de la Cruz Bernal, but this is not in common use. He has, however, been given a persondata box showing him as

|NAME= Bernal, Ulises de la Cruz

There will be many errors of this type if this is not made very clear. I am unsure as to how scripts to automatically extract information might avoid this error. Kevin McE 16:47, 15 May 2007 (UTC)[reply]

Yes, the script is a guide. The human being who was using it should have realized that Bernal was the maternal surname and that it was not Ulises' surname. That's why the name of his article is Ulises de la Cruz, with no Bernal. Generally, I think editors should either stick to names in languages with which they are somewhat conversant, or learn the rules for the language/culture they are editing, so that errors of this type don't propagate. --Rajah 01:48, 16 May 2007 (UTC)[reply]
Interestingly, the German Wikipedia gives this name as: NAME=de la Cruz Bernardo, Ulises , while the Spanish Wikipedia has yet to have his persondata added. --Rajah 01:50, 16 May 2007 (UTC)[reply]
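
Editorial sketch: the error discussed in this thread is what a "last word is the surname" heuristic produces for a two-surname Spanish name. The Python below contrasts that heuristic with one that keeps both surnames together. Both function names and the n_given parameter are hypothetical; nothing here is taken from the actual tagging script, and a human still has to say how many given names there are.

```python
def sort_name_naive(full_name):
    """English-style guess: treat the last word as the surname."""
    *rest, last = full_name.split()
    return f"{last}, {' '.join(rest)}"

def sort_name_two_surnames(full_name, n_given=1):
    """Keep paternal and maternal surnames together, paternal first.

    n_given is the number of given names at the start of full_name;
    it has to be supplied by someone who knows the name.
    """
    words = full_name.split()
    given, surnames = words[:n_given], words[n_given:]
    return f"{' '.join(surnames)}, {' '.join(given)}"

full = "Ulises de la Cruz Bernal"
print(sort_name_naive(full))         # the erroneous form reported above
print(sort_name_two_surnames(full))  # sorts under the paternal surname
```

The second form keeps the paternal surname first, so the entry sorts under "de la Cruz" rather than under the maternal surname.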

German Wikipedia

How is telling people how many articles on the German Wikipedia have persondata useful information on the English Wikipedia? Voretus 17:00, 17 May 2007 (UTC)[reply]

One way it is potentially useful is that a skilled programmer could transfer the persondata wholesale in the same fashion as the Interwiki bot does with interwiki links. --Rajah 18:47, 17 May 2007 (UTC)[reply]
Motivation. Compare it with my answer for So What Does This Do Now. You can't work with only a few articles, you need a larger base. --> "So they've reached > 150k? Wow. We'll try to be better in a few months." (hopefully) --32X 23:18, 17 May 2007 (UTC)[reply]

My two cents on Wikipedia's handling of metadata

Why not put metadata in its own tab for any given article? Thus all articles would have an "Article" tab, a "Discussion" tab and a "Metadata" tab. This would keep the article area clean of metadata, and the tab could (hypothetically) be limited to more advanced users. I realize this might be a bit of a pain in terms of extending the MediaWiki software, but in the long haul would this not be a major improvement? —Wikijeff 16:26, 12 June 2007 (UTC)[reply]

I totally agree, as I mentioned earlier [1]. For now though, this is the best compromise we can make. Mostly the reason this issue isn't that dwelt upon is that only a very small minority appreciate it. I'm slowly working on some offline wikipedia data mining/visualization tools that should, hopefully, get people fired up about it. --Rajah 00:37, 13 June 2007 (UTC)[reply]
I also agree that this would be a great step, but it needs to be raised with the devs (who for all I know may already be working on something of this nature). Community support is only marginally relevant for MediaWiki software issues such as this. -- Visviva 02:27, 13 June 2007 (UTC)[reply]
Until there is a change in the software, it seems to me that the best option is to store metadata on a subpage (I think this has been mentioned before). I discuss this below for the Persondata template, but in fact, the current version of my demonstration allows for arbitrary metadata: the data needed for a particular purpose (such as Persondata) are selected using a key. Geometry guy 20:07, 15 June 2007 (UTC)[reply]

Persondata on a subpage

There has been some discussion here about why Persondata should be separate from the infobox, how birthdates and names are formatted, problems with entering the same information several times, and so on.

A lot of these issues would be easier to deal with if the Persondata were stored on a subpage of the article talk page. (It would make more sense to store it on a mainspace subpage, but these don't exist.) With a small modification to the Persondata template (see User:Geometry guy/Persondata) it is possible to query the Persondata via straightforward transclusion of the subpage. I have made a "proof of concept" at Alexander Grothendieck and Talk:Alexander Grothendieck/Persondata.

Straightforward transclusion of the subpage produces the Persondata table. This may be a problem for search methods which query the wikisource of the article, but I would question whether the latter is the best way to query this data, especially if it involves downloading the entire article.

On the other hand, transclusion of the subpage with a key allows for easy extraction of the data. For example

{{:Talk:Alexander Grothendieck/Persondata|key=birthdate}}

produces

March 28, 1928

with not an SQL query in sight. This can be used to transclude DEFAULTSORT and infobox information into the article, allowing these data to be combined with the Persondata without requiring editors to use infoboxes if they don't want to.

Furthermore, the data on the subpage could be richer than in the Persondata table. I have illustrated this by allowing both the sortname and the usual name to be transcluded. The latter is often the name of the article, so this may not be so useful, but it is not difficult to imagine other applications of the same idea. Indeed one can imagine the infobox template automatically transcluding almost all of the infobox information from this subpage, removing clutter from the wikisource of the article. Geometry guy 14:43, 15 June 2007 (UTC)[reply]

I've now also produced a {{ReadPersondata}} template to make it simpler to include Persondata into an article: in the article itself (or on its talk page) one can use

{{ReadPersondata|key=birthdate}}

instead of the above. Geometry guy 16:05, 15 June 2007 (UTC)[reply]
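
Editorial sketch: the key-based lookup proposed above can be mimicked outside the wiki. The Python below parses a flat template call and returns one field by key. The template name MetaData and the field names name, sortname and birthdate are illustrative assumptions, not the actual layout of the subpage, and the parser deliberately ignores nested templates and piped links.

```python
def parse_template(wikitext):
    """Parse a flat {{Name|key=value|...}} call into (name, fields).

    Simplification: splits on every "|", so nested templates and
    [[piped|links]] inside values are not supported.
    """
    text = wikitext.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call")
    name, *params = text[2:-2].split("|")
    fields = {}
    for param in params:
        key, sep, value = param.partition("=")
        if sep:  # skip unnamed (positional) parameters
            fields[key.strip().lower()] = value.strip()
    return name.strip(), fields

def read_metadata(wikitext, key):
    """Return the value stored under key, or an empty string if absent."""
    _, fields = parse_template(wikitext)
    return fields.get(key.lower(), "")

# Hypothetical subpage content.
subpage = "{{MetaData|name=Alexander Grothendieck|sortname=Grothendieck, Alexander|birthdate=1928-03-28}}"
print(read_metadata(subpage, "birthdate"))
```

An offline tool working from a database dump could use the same idea: read the subpage text once, then pull out whichever fields a given application needs.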

It looks good. Effectively, you are implementing the "new tab" thing discussed above, but putting it on a talkpage subpage instead. One thing though - on the subpage, the metadata is not visible. Is there a way to make it visible so people don't have to click "edit" to see what is there? Also, please see Wikipedia:Bots/Requests for approval/Polbot 3 and User:Polbot/ideas/defaultsort for the rapidly advancing ideas of using a bot to standardise the existing data. That will still encounter the problem of location, as people will still have to update article metadata and sort keys in different locations, and would need to be re-run at intervals. Your proposal would solve this. The problem is which to do first. I'd say do the bot run first (which will also show the scale of the problem), and while that is happening, get this idea of yours advertised more widely. Who knows, if the right developers hear about it, they might implement a metadata tab so we don't have to use subpages of talk pages! Carcharoth 16:55, 15 June 2007 (UTC)[reply]

There should also be a way to add references to confirm that the metadata (such as birth date and place of birth) is correct. How to do this? Carcharoth 16:56, 15 June 2007 (UTC)[reply]
Actually, this is one reason why I think the data should be in the article. People need to be able to edit things directly. If they press edit and instead of "15 April 1955", they see {{ReadPersondata|key=birthdate}}, then that will be very offputting. It is offputting enough for infoboxes at the moment. More templatization of articles would be bad. I can see why consolidating the sortkeys would be a good idea, but I think that it should all centre on DEFAULTSORT. Not sure quite where the solution lies. Carcharoth 17:14, 15 June 2007 (UTC)[reply]
I don't see any easy way to extract the sortable name from DEFAULTSORT: I view this as an application of the sortable name, rather than its source. For example, sortable names can also be used in tables, not just categories. Geometry guy 18:05, 15 June 2007 (UTC)[reply]
Yes, you are right, forget that quibble of mine. Carcharoth 23:16, 15 June 2007 (UTC)[reply]

It is no problem to make the metadata visible on the subpage. (Actually it is already visible to those who've customised their CSS to view person data.) References could also easily be added on the subpage. However, in my view, any information in infoboxes requiring verification should also be in the body of the text. It should be possible to make the infobox not at all offputting: it might just be {{Infobox_President}} for example, with all the data automatically transcluded from the subpage. There could be an edit tab on the infobox which links to edit the subpage.

Concerning what to do: the first thing is not to rush to conclusions, but to think through the various ideas before doing anything; it is probably also a good idea to separate (at least mentally) information gathering from manipulating data. I noticed the rapidly advancing plans you mention already; they look very interesting and I was intending to comment further there soon. Some bot work will surely be needed both for information gathering and data migration, but there are nearly 400000 articles to play with here, and when planning a journey of such a scale, it is invaluable to have a clear idea of the destination. Geometry guy 18:05, 15 June 2007 (UTC)[reply]

I've now made the subpage visible. For this I needed to make the use of the subpage for generating the (invisible) Persondata table explicit. Together with the above discussion, this suggests to me that the subpage could be used to store arbitrary metadata, and the key parameter can be used to extract the data which is needed for a particular purpose (such as the Persondata table). Geometry guy 20:12, 15 June 2007 (UTC)[reply]

Is there any reason that the sub-page (via the template) couldn't also include an hCard microformat? I'd be happy to supply the necessary mark-up. Andy Mabbett 20:34, 15 June 2007 (UTC)[reply]
What would you think about having it at Persondata:Alexander Grothendieck instead? – Quadell (talk) (random) 22:42, 15 June 2007 (UTC)[reply]
These are not valid namespaces at the moment, and so I expect they are viewed as articles (and so artificially inflate the number of WP articles). Geometry guy 23:00, 15 June 2007 (UTC)[reply]
That's fine. It will certainly be made into a valid namespace if this proposal is widely followed. And it will only be widely followed if it's intuitive and easy to use. Metadata:Alexander Grothendieck is a lot more intuitive than Talk:Alexander Grothendieck/Persondata. (Besides, wouldn't this artificially inflate the talkpage count?) – Quadell (talk) (random) 23:08, 15 June 2007 (UTC)[reply]
Sorry if my comment gave the wrong impression, Quadell, as I'm definitely with you in spirit. For instance, I think that disabling subpages in the mainspace is the wrong way to enforce the (sensible) policy of non-hierarchical article format. (Talk page subpages are not disabled, and so they don't inflate the talkpage count.) I agree entirely that this is about metadata in general, not just Persondata: the latter is just one application: I hope you notice this in my more recent comments and edits.
The ugliness of Talk:Alexander Grothendieck/Persondata was the main reason I introduced {{ReadPersondata}}! ({{/MetaData}} would work for me as an article subpage if these were acceptable.) But I am a pragmatist, and we have to build our ideas within the current framework. As you say, if they are successful and intuitive, they may attract a wider attention and a cleaner formulation. Geometry guy 23:38, 15 June 2007 (UTC)[reply]

My answer to Andy would be that hcard format could be included in the processing of the data, but not on the subpage itself, since this data needs to be updatable by any editor. It would be easy, however, to build another subpage which transcluded primitive data into the hcard format. Geometry guy 23:00, 15 June 2007 (UTC)[reply]

I'm not sure why you think that using a microformat would affect an editor; all a microformat is, is HTML classes in the rendered output. They do not appear on the page when editing - have a look at any page using {{infobox biography}} for instance, which has hCard markup in the template, where it is invisible to anyone editing such articles. (I also like the idea of the page being called Metadata:Alexander Grothendieck, BTW). Andy Mabbett 08:22, 16 June 2007 (UTC)[reply]
Thanks for the explanation, although I'm not completely sure I have understood. Most editors do not need to edit infobox templates, so they can contain all sorts of markup. However, editors will need to edit metadata, so this needs to be stored in a simple format somewhere (not in the article). This simple format will be something like {{MetaData|data1=|data2=...}}. The {{MetaData}} template now has quite a lot of work to do. First it must display the data in a simple form on the metadata page itself. Second it must allow queries to extract individual data items. Third (optionally) it could allow queries to output some or all of the data in a particular format (such as a Persondata table). The third of these obviously allows any format.
If I understand correctly, however, you are asking for the display on the metadata page itself to be wrapped in hCard markup. This could certainly be done if it is useful. Geometry guy 10:17, 16 June 2007 (UTC)[reply]
"you are asking for the display on the metadata page itself to be wrapped in hCard markup" - yes, that's right. Andy Mabbett 10:23, 16 June 2007 (UTC)[reply]
Thanks Andy. I will try it out. Could you explain to me (perhaps on my talk page) what the hCard markup is for and how it is used? Geometry guy 10:27, 16 June 2007 (UTC)[reply]
Please try What are microformats? and hCard, and let me know if you have further questions. Andy Mabbett 10:33, 16 June 2007 (UTC)[reply]
  • I think it's important to state some things about the philosophy of metadata out front. Like:
    • Visible information in (non-template) article-space should never come from metadata. The text "Born {{Getmetadata|birthdate}}" should never appear, for instance. It is only used for categories, or in templates and such.
    • The Wikipedia article is the source of the metadata, by definition. There should never be metadata information which is not mentioned (and, ideally, sourced) in the article itself. That way we don't have to worry about cites in metadata -- the source for the info is the Wikipedia page. Conversely, if there is a discrepancy (not caused by vandalism), it's safe to assume that the article is right.
    • Templates such as infoboxes could be simply transcluded into articles without parameters. These templates would use magic words to point to the metadata. The [edit] link on infoboxes should go to the article's metadata, not the template (since these templates are, let's be honest, too complex for most users to edit anyway.)
    • It should be as simple as possible for users to find and edit metadata.

I'd actually suggest moving this discussion to someplace more centralized. Maybe Wikipedia:Separate metadata? Perhaps categorized under Category:Wikipedia proposals, with links from Wikipedia:Requests for comment/Style issues and Wikipedia:Centralized discussion? I don't want to get the conversation bogged down with too many opinions and ideas, but on the other hand, this would affect a huge portion of Wikipedia if implemented. – Quadell (talk) (random) 23:08, 15 June 2007 (UTC)[reply]

I agree entirely with your bullet points about metadata. The metadata should be taken from the article and stored in one place (so that if the article is wrong and needs to be updated, it is easy to fix the data). Then it should be used to generate persondata, infoboxes, etc. And of course edit links should point to an easy-to-comprehend metadata page, not a complicated template! Geometry guy 00:01, 16 June 2007 (UTC)[reply]
  • Responding to Quadell about namespace: do you mean a new namespace with its own talk page, or do you mean a new tab associated with articles (like the current talk pages)? If a new namespace, you have the problem of what happens when the article is moved to a new name. If a tab, then that would move with the page, and issues would be discussed on the talk page as normal. If you are going to have a new tab or new namespace, why not just go the whole hog and make it available for all metadata (including the hCard format Andy Mabbett mentioned above)? BTW, is that really a new namespace, or have you just created an article page with the title "Persondata:Alexander Grothendieck"? :-) I think a new tab is the most viable method, but unfortunately that would also be most developer resource-intensive. Carcharoth 23:16, 15 June 2007 (UTC)[reply]
At the moment this is simply a new article page in the mainspace. I imagine the intention would be to have a new tab associated with articles. Meanwhile, we have to develop the concept with the software as it is. Geometry guy 10:24, 16 June 2007 (UTC)[reply]
  • Responding to Geometry guy's comments, thanks for making the subpage visible. One query though - why isn't the sortkey parameter visible? Is that because the original persondata template doesn't include that parameter yet? About the references, you are quite right, for this sort of data that should (in fact must) be in the main text of the article as well as an infobox, the references stay with the article. I was thinking more of the kind of data included in infoboxes for things like articles on chemical elements or planets. See hydrogen and Earth for examples, though they have their own ways of dealing with their data. We should probably stick to biographical data for now! The idea of an edit tab linking to the subpage is a great idea. You've actually answered all my worries so far, and I agree entirely about taking it slowly and getting an idea of what is needed first. So, what next? Carcharoth 23:16, 15 June 2007 (UTC)[reply]
The sortkey parameter isn't visible because I was lazy and just copied the format from the persondata table. However, now that the persondata behaviour is separated from the subpage data, anything is possible. The template can display the data on the subpage however you want it to, and still provide a query mechanism to access the individual fields, and also constructions built from these fields, such as the persondata.
Regarding infoboxes, I think we have to rely on the specific infobox to decide how to handle the data. They are all very diverse, but they might all benefit from transcluding subpage data. The wikilinking and formatting of this raw data, should, however, be left to the infobox template.
The subpage idea is still rather dominated by the initial motivation from the Persondata template. It is becoming more flexible now, and I will continue to work in that direction. Geometry guy 23:53, 15 June 2007 (UTC)[reply]
  • Responding to Quadell again, after edit conflict, I agree a separate page to discuss the wider issues of metadata is needed, but surely this has been discussed elsewhere before? This may even be a perennial proposal, though maybe no-one's ever taken it this far. Carcharoth 23:16, 15 June 2007 (UTC)[reply]

Update

I've now added more data to Talk:Alexander Grothendieck/Persondata and shown how this data can be used to generate the infobox. There is also a link in the infobox to allow the data to be edited. This is all very much an experiment and a work-in-progress, so please add comments. Geometry guy 22:09, 16 June 2007 (UTC)[reply]

There's now an hCard, too. Andy Mabbett 22:13, 16 June 2007 (UTC)[reply]

Tidying up a few pages

Should the pages at Template talk:Persondata/Removing data have persondata or not? If not, please help tidy them up. Thanks. Carcharoth 01:03, 16 June 2007 (UTC)[reply]

Just remove them. Persondata belongs only on biographies (only one instance per page) or on redirects if there's only one article covering several people (e.g. twins, who are only notable as twins but not as single persons). --32X 01:30, 16 June 2007 (UTC)[reply]

Persondata tagging script

In case anyone missed it, see Wikipedia talk:Persondata#Half-automatic tagging with persondata-tool for details of a script to help extract and add persondata. Carcharoth 01:10, 16 June 2007 (UTC)[reply]

Link to the archived discussion: Wikipedia_talk:Persondata/archive2#Half-automatic_tagging_with_persondata-tool --Rajah 08:09, 17 June 2007 (UTC)[reply]

Metadata in biography infoboxes

What's to be done about projects like WikiProject Composers and WikiProject Opera, where a cabal are insisting, against the evidence, that they have a consensus for the removal of biographical infoboxes from all of "their" articles? Andy Mabbett

Forum-shopping, Mabbett. Isn't it time to let this go? Moreschi Talk 09:02, 16 June 2007 (UTC)[reply]
Not forum shopping; this is perfectly on-topic - and relates to previous discussion - here. Have you read that discussion, or are you just following me around? Andy Mabbett 09:34, 16 June 2007 (UTC)[reply]
The advantage of a separate MetaData page is that if editors of an article want to remove the biographical infobox, they can. The data remains available on the MetaData page. Hopefully this will reduce conflict of the kind that I am sensing here! Geometry guy 10:36, 16 June 2007 (UTC)[reply]
I thought that the proposal was to generate the metadata page using data from the infobox? Andy Mabbett 10:59, 16 June 2007 (UTC)[reply]
Initially, there may need to be a data migration exercise to create metadata pages from current infobox and persondata tables. However, thereafter, the proposal is to generate the infobox (if desired) and the persondata (if needed) from the metadata. Indeed, the whole point of the proposal is that it is much easier to generate the infobox from the metadata than vice-versa. Geometry guy 12:54, 16 June 2007 (UTC)[reply]
That seems reasonable, so long as there is only one place where the data is entered or amended. Andy Mabbett 13:16, 16 June 2007 (UTC)[reply]

Granularity of fields

Is there any reason why a person's name is held as a single field, and not as, say, "family name" "given name", etc? Andy Mabbett 10:25, 16 June 2007 (UTC)[reply]

No. This is an example of what I had in mind by the potential for the Person/Metadata subpage to refine the information in the standard Persondata table. The dates of birth and death could be refined in a similar way, if desired. Geometry guy 10:39, 16 June 2007 (UTC)[reply]
Good, separate day, month, season and year fields would greatly facilitate display of date of birth (and of death, as planned) in hCard, which requires the YYYY-MM-DD format; and which, if DoB is given as, say, "spring 1456", should only be output as "1456". Perhaps we might also consider adding "honorific prefix" and "honorific suffix" fields (for use in, for example, "Sir Jim Smith, OBE"). A handy list of hCard properties may be found at http://microformats.org/wiki/hcard-cheatsheet Andy Mabbett 10:56, 16 June 2007 (UTC)[reply]
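
Editorial sketch: with separate fields, the date rule described above is a few lines of code. The Python below returns YYYY-MM-DD when day and month are both known and falls back to the bare year otherwise, so "spring 1456" yields "1456". The function name and parameters are hypothetical, and falling back to the year for month-only dates is an assumption made here, not something settled in the discussion.

```python
def hcard_bday(year, month=None, day=None, season=None):
    """Return the most precise date string the fields support.

    season is accepted so callers can pass it, but it is never output:
    a date such as "spring 1456" is reduced to "1456".
    """
    if month is not None and day is not None:
        return f"{year:04d}-{month:02d}-{day:02d}"
    # Assumption: anything less than a full date is reduced to the year.
    return f"{year:04d}"

print(hcard_bday(1928, 3, 28))
print(hcard_bday(1456, season="spring"))
```

A single free-text date field would need parsing before any of this could be done, which is the case for finer granularity.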

Location

What was the rationale behind telling people to put it before the cats and IWs? Seems far more logical to be at the very bottom of everything (ie the Cats and IWs are displayed by default, and are more directly related to Wikipedia readers--metadata would make more sense separate from all the content, just like dab headers should be on the first line, ahead of all article content). It also often gets used in a way that adds excess whitespace to the displayed article--this could easily be avoided by putting it at the bottom. I do agree that the first directions (between Cats and IWs) were worse, but I don't think before the Cats was the best "fix". Sohelpme 01:43, 17 June 2007 (UTC)[reply]

Presumably, that was because of technical reasons. See Wikipedia_talk:Persondata/archive1#Location_in_article, Wikipedia_talk:Persondata/archive1#place_inside_the_article. Hopefully, metadata templates can get out of the main article in the future... --Rajah 15:53, 17 June 2007 (UTC)[reply]
It only adds whitespaces when stub templates are following. ("By convention this is placed at the end of the article, after [...] the category tags, so that the stub category will appear last.") So better start changing stubs to articles. ;) --32X 20:23, 17 June 2007 (UTC)[reply]
All other Wikipedia language editions (I think) place cats and interwiki-links at the very bottom. Interwiki-links are frequently added/changed by users from other language editions, or by interwiki-bots. It would be quite difficult for them if they had to look up the individual rules for placing the interwiki-links in each language edition. --88.134.44.255 02:18, 19 June 2007 (UTC)[reply]

Name order, etc etc

We read:

When specifying the person's name, use the following format: [surname], [forename] [middle names], [title].

This may need further thought.

Mao Zedong becomes "Mao, Mao", at least for those who rather reasonably presume that "forename" means the name that comes in front, and don't take the link to forename (a redirect to given name).

For those who do know that "forename" means given name, Mao Zedong becomes "Mao, Zedong". I thought that the comma was intended as a sign that the normal order had been reversed, but here it isn't.

I hazily remember or (very likely) misremember that (i) Vietnamese names are surname-last and (ii) Vietnamese people are referred to by their given names. It's very likely that I am totally confused here; but let's suppose for a moment that I'm right. Ho Chi Minh would then become "Minh, Ho Chi".

There are already comments above about the oddness of this template, or its instructions, or both, for Spanish names. Before the template (whose existence I only noticed today) is used tens of thousands more times, I suggest inviting people likely to know about names in Spanish, Russian, Chinese, Vietnamese, Hungarian (etc etc, but let's not labor the point) to discuss it. No doubt the short/medium-term result will be argument, confusion and frustration. Better to have that sooner than later.

(Incidentally, what about other scripts? There's no mention of these. Are Cyrillic, Hanzi, Hangul, etc. most welcome, tepidly welcome, permissible, or impermissible here?)

-- Hoary 02:51, 28 June 2007 (UTC)[reply]

The "[surname], [forename] [middle names], [title]" thing is a simplification that is only accurate for English-style names. It should be reworded to make it clear that the description should be written in a standard format for alphabetizing. This differs depending on the naming tradition. Mao Zedong would be listed as "Mao Zedong", since that's the standard way of alphabetizing Chinese names. Spanish names are different as well: Vicente Fox Quesada would be alphabetized either as "Fox Quesada, Vicente" or "Fox, Vicente, Quesada", I'm not sure which. Arabic names are a unique challenge unto themselves. Also, note that non-English characters are converted to English characters for the purposes of alphabetization: Võ Nguyên Giáp is alphabetized as "Vo Nguyen Giap" (since Vo is the family name). – Quadell (talk) (random) 03:49, 28 June 2007 (UTC)[reply]
P.S. For guidance on specific naming traditions and scripts, you might check out Wikipedia:Manual of Style (Arabic), Wikipedia:Naming conventions (Chinese), Wikipedia:Naming conventions (Cyrillic), etc. etc. etc. – Quadell (talk) (random) 03:58, 28 June 2007 (UTC)[reply]
That's a good start.
I know, or can find out, how to name the people that I want to name. What I don't know is how to make this compatible with the hazily understood purposes of this template, let alone with the instructions that are given for using the template.
How about Dürer (German)? Is his name "alphabetized" as Durer (ugh) or Duerer?
Is "ë" a non-English character? If so, I suppose that "Brontë" is a non-English name, needing "alphabetization" as "Bronte".
So Vo is indeed the family name. (Ha, good! My memory isn't as bad as I thought.) However, he's General Giap, surely. (The WP article says e.g. "Giáp was educated at..." not "Vo was educated at...") How about academic books in English about Vietnam: do they index him under Vo or under Giap? If the latter, shouldn't this metadata thingie do the same, or have a feature pointing out that he's not normally indexed by family name?
Offhand I can't think of any language/culture in which personal names are in such arrangements as [surname] [everyday given name] [optional additional given name(s)]. But I expect they exist, and thus "middle names" is a dodgy term too. -- Hoary 04:13, 28 June 2007 (UTC)[reply]
When people look up Emily Brontë, they expect to find her between "Brontalope" and "Bronticide", since e comes between a and i. But most sorting algorithms put special characters at the beginning or the end. That's why Dürer becomes Durer: not because it looks good (it looks terrible), but because people expect him after Dolly but before Dustbunnies. The reason for the NAME parameter of persondata is for listing and sorting. – Quadell (talk) (random) 04:27, 28 June 2007 (UTC)[reply]
Fine. But the page should explain this kind of thing. Of course I don't say "Drop all your other commitments and do this right now!", and I even see some merit in hiving it off into a sub-page, in order not to confuse Mr Average Wikipediaeditor, who will be almost exclusively concerned with "English" names. (Uh, hang on, Mr Average Wikipediaeditor appears to be disproportionately interested in [disproportionate] pornstars: Is Maxi Mounds to be parsed as [pseudo given name] [pseudo surname] or as [adjective] [noun]? Even the names of people of great interest to anglophones can be problematic.) -- Hoary 04:40, 28 June 2007 (UTC)[reply]
Very true. :-) And let's not forget video-game characters. "Kong, Donkey, Jr."? – Quadell (talk) (random) 04:45, 28 June 2007 (UTC)[reply]
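As an aside, the folding described above (Dürer sorted as Durer, Võ Nguyên Giáp as Vo Nguyen Giap) can be sketched in Python. This is only an illustration under assumptions: the function name sort_key is made up here, and nothing in this thread says any Wikipedia tool works this way. It strips combining accents only; it does not produce the German-style "Duerer", and letters that have no accent decomposition (such as "ø" or "ß") pass through unchanged.

```python
import unicodedata


def sort_key(name: str) -> str:
    """Fold accented Latin letters to their base letters for sorting (hypothetical helper)."""
    # NFKD splits "ü" into "u" followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks (accents, diaereses, tone marks) left by decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


names = ["Dustbunnies", "Dürer", "Dolly"]

# Plain code-point order places "Dürer" after "Dustbunnies".
print(sorted(names))

# Sorting on the folded key places "Dürer" between "Dolly" and "Dustbunnies".
print(sorted(names, key=sort_key))
```

The original spelling is kept for display; only the comparison key is folded, which matches the point that the folding exists for listing and sorting rather than for looks.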

The rule with metadata is to make it as flexible as possible for different applications. For instance, if you wanted to use metadata to produce an alphabetically sorted list, you might need a separate field for "alphabetical sortkey", and this would vary with different naming conventions. The "name" field, however, is something you would expect to read when seeing the entry in a list. This can be reconstructed from "given name", "middle name", "surname", but that can get complicated. Let's take an example to see what I mean: the Ferdinand Magellan example on the documentation page. The metadata there can be parsed to produce the following: "Ferdinand Magellan (Portuguese: Fernão de Magalhães; Spanish: Fernando de Magallanes) was a sea explorer who was born in 1480 in Sabrosa, Portugal, and died on 27 April 1521, on Mactan Island, Cebu, the Philippines." The same sort of sentences can be mechanically generated, and can incorporate things like titles (Sir, Lord, King), and post-titles (MBE, CBE), etc. If a metadata field is precise enough that it can be parsed into an English sentence, then it is usually good enough to be used in most other applications. Carcharoth 19:53, 11 July 2007 (UTC)[reply]
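A rough sketch of the mechanical sentence generation described above, in Python. The field values are reconstructed from the sentence quoted in the comment rather than copied from the documentation page, so treat them as assumptions; the helper names display_name and describe are hypothetical. To avoid choosing between "in 1480" and "on 27 April 1521", the sketch uses a flatter sentence shape than the quoted one.

```python
def display_name(sort_name: str) -> str:
    """Turn "Surname, Forename" into "Forename Surname"; leave other forms alone."""
    if "," in sort_name:
        surname, forename = (part.strip() for part in sort_name.split(",", 1))
        return f"{forename} {surname}"
    return sort_name


def describe(person: dict) -> str:
    """Build one English sentence from whichever fields are present."""
    text = display_name(person["NAME"])
    if person.get("ALTERNATIVE NAMES"):
        text += " (" + person["ALTERNATIVE NAMES"] + ")"
    # "was a" assumes the person is dead; a fuller version would check DATE OF DEATH.
    text += " was a " + person["SHORT DESCRIPTION"]
    for verb, date_field, place_field in (
        ("born", "DATE OF BIRTH", "PLACE OF BIRTH"),
        ("died", "DATE OF DEATH", "PLACE OF DEATH"),
    ):
        # Keep only the details that are actually filled in.
        details = [person[f] for f in (date_field, place_field) if person.get(f)]
        if details:
            text += "; " + verb + " " + ", ".join(details)
    return text + "."


# Illustrative values, assumed from the sentence quoted above.
magellan = {
    "NAME": "Magellan, Ferdinand",
    "ALTERNATIVE NAMES": "Portuguese: Fernão de Magalhães; Spanish: Fernando de Magallanes",
    "SHORT DESCRIPTION": "sea explorer",
    "DATE OF BIRTH": "1480",
    "PLACE OF BIRTH": "Sabrosa, Portugal",
    "DATE OF DEATH": "27 April 1521",
    "PLACE OF DEATH": "Mactan Island, Cebu, the Philippines",
}

print(describe(magellan))
```

Note that display_name only handles the simple "Surname, Forename" case; names such as "Mao Zedong" pass through untouched, which is the complication raised earlier in this thread.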

Ironically enough, User:Polbot (a bot run by Quadell) is doing the reverse for politicians and obscure animal and plant species. The entries on a website he is using are generated from a database, and Polbot is (I presume) just parsing out the information and constructing approximately grammatical English "articles". The results can be impressive. Carcharoth 19:56, 11 July 2007 (UTC)[reply]

Along the lines of that Polbot, shouldn't the default name value be the same as the DEFAULTSORT, if there is one? That's what I do if it's unclear (taking into consideration knowledge of that culture's naming scheme, of course). --Rajah 07:19, 13 July 2007 (UTC)[reply]
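For anyone wanting to try the DEFAULTSORT idea above, here is a minimal Python sketch. It assumes the sort key appears in the wikitext in the usual {{DEFAULTSORT:...}} form; the function name default_name is hypothetical, and whether the result is a suitable NAME value is still a judgment call that depends on the naming tradition.

```python
import re
from typing import Optional

# Matches {{DEFAULTSORT:Surname, Forename}} and captures the key without padding.
DEFAULTSORT_RE = re.compile(r"\{\{\s*DEFAULTSORT\s*:\s*([^}]*?)\s*\}\}", re.IGNORECASE)


def default_name(wikitext: str) -> Optional[str]:
    """Return the DEFAULTSORT key as a suggested NAME, or None if there is none."""
    match = DEFAULTSORT_RE.search(wikitext)
    return match.group(1) if match else None


print(default_name("Some article text. {{DEFAULTSORT:Fox Quesada, Vicente}}"))
print(default_name("An article with no sort key."))
```

Returning None rather than guessing leaves the unclear cases for a human editor.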

Links

Within the persondata template, should we wikilink things like the name of a state or country or dates? Rlevse 02:41, 19 July 2007 (UTC)[reply]

Yes, you should, if possible. From the project description: "Wikilinks in the persondata are not currently necessary; however, they may be useful for some future application." I almost always add them, usually because I am just copying the already wikilinked first few lines of the bio. Having these named entities wikilinked will allow for cool applications in the future. --Rajah 16:39, 21 July 2007 (UTC)[reply]
Interesting question, since at the German Wikipedia the rule's a bit different: Link only the place of birth/death, because someone's birthplace might be Madrid, but not Spain. (Actually the rule currently isn't enforced very well.) --32X 00:15, 22 July 2007 (UTC)[reply]
I'm not sure what you mean by "only the place of birth/death". If someone is born in Madrid, Spain, then they were born in Spain too. Yes, Istanbul was Constantinople was Byzantium, but whoever adds the persondata for people born in that city in the 1900s, the 900s and/or antiquity will add the correct name. No ancient Greek was born in Istanbul after all. For Mikhail Baryshnikov, his birthplace seems right: Riga, Soviet Union, even though Riga is now in Latvia and the Soviet Union no longer exists. If he were to return to his birthplace and expire, his deathplace would be Riga, Latvia. Right? Rlevse's question also touched on dates, and I'd like to add the point that retaining the wikilinking for professions and nationality in the short description field will also retain info that can be used for future applications. --Rajah 00:32, 22 July 2007 (UTC)[reply]
The argument was that while Madrid is the birthplace, Spain is not a place in that sense.
For the other example, you've described how it should be done. My favourite example: it's not interesting that someone was born in Berlin in the 1960s/70s/80s. In this case, which part of the city it was is the important information. --32X 15:11, 22 July 2007 (UTC)[reply]
Yeah, I think we're basically in agreement here. For those Berliners born in those decades, seeing birthplaces as Berlin, West Germany or Berlin, East Germany would seem to clear it up (and shows that the political entity does store valuable information). --Rajah 06:31, 23 July 2007 (UTC)[reply]

Granularity of names

FYI: WikiProject Infoboxes: Granularity of people's names. Andy Mabbett | Talk to Andy Mabbett 14:59, 31 July 2007 (UTC)[reply]

Resting places of dead people

The templates {{Infobox person}} and {{Infobox actor}} now have parameters for resting place and resting place_coordinates (see Marilyn Monroe for an example using both). These are included in the hCard microformat. I think we should add them in PERSONDATA, too. Andy Mabbett | Talk to Andy Mabbett 12:12, 19 August 2007 (UTC)[reply]

That's not a very good idea. The persondata is a small set of rudimentary information about a person. The resting place isn't rudimentary and isn't even known for the majority of people covered by biographies. I also fear the result of such information for Frederick I, Holy Roman Emperor. --32X 22:03, 19 August 2007 (UTC)[reply]

Metadata standardization

Please see Wikipedia talk:Metadata standardization. Thanks! Kaldari 23:25, 27 August 2007 (UTC)[reply]

Too much deference to computers -- Death data fields visible for living people

I understand the efficiency, etc., of not removing the "DATE OF DEATH" field for people who are living, but it strikes me as unseemly.

It's easier to write software to extract data from data sets where a keyword exists for all expected data fields. But aren't humans the primary audience for Wikipedia?

It may serve computers well for the form to say the equivalent of "DATE OF DEATH= PENDING", but I think that it's a downer. -- Ac44ck 22:57, 11 September 2007 (UTC)[reply]

Respice post te! Hominem te esse memento! Dsp13 00:08, 12 September 2007 (UTC)[reply]
I don't know Latin.
It seems to mean:
Look behind you! Remember that you are but a man!
According to this:
http://en.wikipedia.org/wiki/Memento_mori
Two comments:
1. I prefer to look ahead, but not _that_ far ahead.
2. I hope that "but a man" is not suggesting that machines are to be elevated above humans.
When reading up on a noteworthy contemporary (or a vibrant, world-class athlete half my age -- which is where I first encountered the visible "DEATH PENDING" notice), I don't necessarily want to be reminded of their (or my) eventual death. I don't see that it does anything for the human-readable content of the article -- it only serves a data-parsing function for computers.
What reader-focused purpose is served by bringing attention to the mortality of someone who is at the top of their game?
-- Ac44ck 01:32, 12 September 2007 (UTC)[reply]
Sorry for the flippant reply. I guess personally (living in a rich bit of the world, where death's not very visible) I'm more worried that I might forget the general fact of human mortality than that I might be reminded of it too much. (I certainly don't rate machines above humans: only humans can be conscious that they will die!). But people's sensitivities vary (& my talk page would be a better forum to have that conversation) - so sorry if I caused offence. As far as persondata death fields go, though, persondata (unlike other bits of the page) is indeed basically meant for machine reading rather than human reading - which is why the default stylesheet preference is to make it invisible. Dsp13 15:14, 12 September 2007 (UTC)[reply]
So the millions of readers of countless biography pages will encounter non sequitur "DEATH PENDING ... just you wait -- if you don't go first" notices among the praises for some noteworthy, living person for the sake of the occasional programmer who can't be bothered to write a "If it isn't there, don't read it" routine.
It seems like a bad trade off to me.
Ac44ck 20:34, 13 September 2007 (UTC)[reply]
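For what it's worth, the "if it isn't there, don't read it" routine mentioned above is short to write. A minimal Python sketch follows, assuming one "|FIELD= value" pair per line; the function name parse_fields and the sample person are made up for illustration, and this is not the parser used by any actual Persondata tool.

```python
def parse_fields(template_text: str) -> dict:
    """Collect the fields that are present and non-empty; skip everything else."""
    fields = {}
    for line in template_text.splitlines():
        line = line.strip()
        if line.startswith("|") and "=" in line:
            key, value = line[1:].split("=", 1)
            # An empty value is treated the same as a missing field.
            if value.strip():
                fields[key.strip()] = value.strip()
    return fields


# Hypothetical living person: the death fields are simply omitted.
sample = """{{Persondata
|NAME= Example, Jane
|SHORT DESCRIPTION= tennis player
|DATE OF BIRTH= 1 January 1980
}}"""

person = parse_fields(sample)
# dict.get returns None for an absent field instead of raising an error.
print(person.get("DATE OF DEATH"))
```

A reader written this way works whether the death fields are left blank or left out entirely, so the choice between the two can be made on other grounds.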
Maybe I'm missing something, but persondata should not be visible to someone reading an article unless they have specifically modified their user CSS file (e.g. monobook.css) to make it visible. Which article did you first see the DEATH PENDING (or whatever) field in? Maybe there's an error in the syntax on that page making it visible. Dr pda 22:06, 13 September 2007 (UTC)[reply]
It does, indeed, disappear _if_ I allow the CSS file to be rendered. I wasn't aware of that, but it doesn't change my position.
I usually browse with as many bells and whistles turned off as I can manage -- webmasters frequently use unwelcome color schemes. Such is the case on the page where I first encountered the persondata template: bright white background if the CSS file is allowed to be rendered.
Who says that a bright white background is "normal"? It's glaring and promotes eye strain. As I understand it, an amber screen provides the optimum clarity and comfort. I have configured my system to show text on an amber-ish background. I find that works well for me. The Windows duh-fault is a glaring white background -- and many lemming webmasters follow suit when specifying colors in their CSS.
So I browse with CSS rendering turned off. Hence, after reading about the impressive list of accomplishments by a young, healthy, world-class athlete here:
http://en.wikipedia.org/wiki/Anna_Kournikova
The parting shot was, "Oh, yeah ... and we're waiting for her to die."
Eewwww!
The "Infobox" template has a place for a death date, and it is waiting in the wings to be filled in -- but it does so discreetly.
I think it is inconsistent with other Wikipedia behaviors for the persondata template to await the inevitable so obviously -- and in ALL CAPS.
Perhaps the current situation is helpful to computers. But computers don't write checks to the Wikimedia Foundation; people -- who have "sensitivities" -- do.
Ac44ck 03:46, 14 September 2007 (UTC)[reply]
Yeah. But most of them won't see it. Most people go for the duh-fault. Carcharoth 04:01, 14 September 2007 (UTC)[reply]

Added instructions for extraction from database dump

Hi all, I've added instructions for how to extract persondata from the database dump and load it into a MySQL database. I've also compiled some lists of articles with problems in their persondata syntax; these can be found at User:Dr pda/Sandbox. Feel free to help tidy these up. There are over 1000 articles where the name has been entered in the form John Smith rather than Smith, John, so perhaps further education is needed somewhere. Dr pda 23:45, 11 October 2007 (UTC)[reply]

Nice work! I totally agree that we need to make it easier for the people who are new to Persondata. --Rajah 06:20, 16 October 2007 (UTC)[reply]
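The "John Smith rather than Smith, John" check mentioned in this thread could be approximated along these lines. This Python sketch is not the query used to build the lists at User:Dr pda/Sandbox; the function name names_to_review and the sample records are hypothetical, and it assumes the (article title, NAME) pairs have already been extracted. A missing comma is only a hint: names such as Mao Zedong are correctly written without one, so the output is a list for human review, not a list of errors.

```python
def names_to_review(records):
    """Yield (title, name) pairs whose NAME field contains no comma."""
    for title, name in records:
        if "," not in name:
            yield title, name


# Hypothetical extracted records: (article title, NAME field).
records = [
    ("John Smith", "John Smith"),       # probably should be "Smith, John"
    ("Emily Brontë", "Bronte, Emily"),  # already surname-first
    ("Mao Zedong", "Mao Zedong"),       # no comma, but correct as it stands
]

for title, name in names_to_review(records):
    print(f"{title}: {name}")
```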

We need to rewrite/revise/update the instructions

Especially, how to deal with things like inexact birth and death dates. e.g. Are "1480/1", "late 1500s", "c. April 25, 1994", etc. acceptable? If not, what should be there? (I know a lot of this is answered in the archives to this talk page. When I get time, I'm gonna go back and collate all the suggestions/pronouncements and place them where they should be (IN THE INSTRUCTIONS).) Also, reminding people that Last name first holds in alternative names as well. (I've been guilty of that myself.) Etc. I think the best way of showing those new to Persondata how to add it would be about 4-5 examples of common things that occur: People who are still alive, people with unknown birthdates/years, people without simple "LastName, FirstName" orderings, etc. --Rajah 06:19, 16 October 2007 (UTC)[reply]