Figuring out the Polish State Archive changes

After my earlier post, Changes at the Polish State Archives, about the closing of several important record databases at the Polish State Archives, it was pointed out that the database I directed people to use instead, szukajwarchiwach.pl, is also going to be shut down.

szukajwarchiwach.pl on left and szukajwarchiwach.gov.pl on right

It has been announced that the site will be replaced by szukajwarchiwach.gov.pl. The date for that transition has not been announced yet, but hopefully they will not make the switch before you can do everything on the new site that you can do on the old site. I’m going to discuss two issues I have with the new site, one very significant, and one perhaps less so, but that still bothers me quite a bit.

You can’t get the same search results

As it currently stands, the new site cannot do the same kinds of searches as the old site. I pointed people to szukajwarchiwach.pl because I was able to show the exact same results from searches on both PRADZIAD and szukajwarchiwach.pl, even if the results were in a different order and format. It does not seem possible to do the same kind of searches on szukajwarchiwach.gov.pl.

For example, in my earlier article, I wrote about searching for all Jewish civil registers (birth, marriage, divorce, death, etc.). Both PRADZIAD and szukajwarchiwach.pl returned 3303 results:

3303 results from both PRADZIAD (left) and szukajwarchiwach.pl (right)

However, there is no specific way to do the above search on szukajwarchiwach.gov.pl. On PRADZIAD you can choose the denomination that the register was kept by:

Selecting denomination in the PRADZIAD search page

Similarly on szukajwarchiwach.pl in Advanced Search you can choose from a menu which inserts the denomination into the search:

Selecting denomination in the szukajwarchiwach.pl Advanced Search

However, on the szukajwarchiwach.gov.pl site, there is no selection menu to select denomination, or anything else really. You need to start with a text search. Only after you do a text search can you then filter out some results by some of the settings like denomination. The closest one seems to be able to do is to actually search for the word mojżeszowe:

which provides the following results:

Note that there are 3405 total results, which include 3266 ‘record files’ which I would guess correspond to the civil registers. Either way it’s not 3303 records. Moreover, if you do a simple text search for ‘mojżeszowe’ on szukajwarchiwach.pl, these are the results you get:

Note that there are 12321 results. That’s basically the same search that provided 3405 results on szukajwarchiwach.gov.pl. Moreover, if you restrict those 12321 results to ones categorized as belonging to the mojżeszowe denomination, there are only 865 results. In other words, only 865 of the results that match the free-text search for ‘mojżeszowe’ are also categorized as belonging to that denomination. We already know that there are 3303 records that match that community, so a free-text search doesn’t seem to cover these records very well.
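To put those counts side by side, here is a small Python sketch. It only uses the numbers quoted above. The one assumption I'm making (and I can't verify it) is that the 865 categorized results are all among the 3303 registers found by the denomination search.

```python
# Counts quoted in this post; nothing here queries either site.
DENOMINATION_SEARCH = 3303            # PRADZIAD and szukajwarchiwach.pl, by denomination
NEW_SITE_TEXT_TOTAL = 3405            # szukajwarchiwach.gov.pl, free text 'mojżeszowe'
NEW_SITE_RECORD_FILES = 3266          # the 'record files' subset of those 3405
OLD_SITE_TEXT_TOTAL = 12321           # szukajwarchiwach.pl, free text 'mojżeszowe'
OLD_SITE_TEXT_IN_DENOMINATION = 865   # of those, categorized as mojżeszowe


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# How far the new site's counts are from the 3303 baseline.
print(NEW_SITE_TEXT_TOTAL - DENOMINATION_SEARCH)
print(NEW_SITE_RECORD_FILES - DENOMINATION_SEARCH)

# Share of the old site's free-text hits that carry the denomination tag.
print(percent(OLD_SITE_TEXT_IN_DENOMINATION, OLD_SITE_TEXT_TOTAL))

# Assumption: if the 865 are a subset of the 3303, this is the share of the
# known registers that the free-text search actually reaches.
print(percent(OLD_SITE_TEXT_IN_DENOMINATION, DENOMINATION_SEARCH))
```

If that assumption holds, the free-text search finds only about a quarter of the registers that the denomination menu finds.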

A lack of elegance

The new site, szukajwarchiwach.gov.pl, is a prettier and more modern site, but it lacks the elegance of the old site. What do I mean by elegance? If you wanted to send someone to look at a specific fond on the old site, it was fairly easy to do. To get to the page of a specific archive, you only needed to know the archive code (see my chart of PSA contact information for the codes of each archive). For example, the code for Przemyśl is 56. To see information about that archive you go to the following link:

https://szukajwarchiwach.pl/56

To send someone to a specific fond within that archive, you basically just add the fond number. For example to go to the fond that contains Jewish records from the town of Kańczuga (Fond 1731), you would go to the following URL:

https://szukajwarchiwach.pl/56/1731/0/

To list all the series in that fond, you just add the code (#tabSerie):

https://szukajwarchiwach.pl/56/1731/0/#tabSerie

To list all the sections (sygnatura) within that fond, you just add the code (#tabJednostki):

https://szukajwarchiwach.pl/56/1731/0/#tabJednostki

This simplicity made it very easy to share information with fellow researchers. You can even link to a specific sygnatura by adding the series and sygnatura numbers:

https://szukajwarchiwach.pl/56/1731/0/1/2

The added 1 is series 1 (birth records) and the 2 is the second sygnatura (out of the whole fond).
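Because the old scheme is so regular, you can build these links yourself. Here is a minimal Python sketch that reproduces the example URLs above. The function name and its parameters are my own invention, and I'm treating the 0 after the fond number as a fixed segment simply because it appears in every fond link shown here; I don't know what it stands for.

```python
from typing import Optional

OLD_BASE = "https://szukajwarchiwach.pl"


def old_site_url(
    archive: int,
    fond: Optional[int] = None,
    series: Optional[int] = None,
    unit: Optional[int] = None,
    tab: Optional[str] = None,
) -> str:
    """Build a szukajwarchiwach.pl link from archive, fond, series and sygnatura numbers.

    `tab` is an optional fragment such as "tabSerie" or "tabJednostki".
    """
    if fond is None:
        # Archive page only, e.g. /56
        return f"{OLD_BASE}/{archive}"

    # Assumption: the "0" segment is constant; it is in every fond link above.
    url = f"{OLD_BASE}/{archive}/{fond}/0/"

    if series is not None and unit is not None:
        # Link to one specific sygnatura within a series.
        return f"{url}{series}/{unit}"

    if tab is not None:
        # Tab listing all series or all sygnatura in the fond.
        url += f"#{tab}"
    return url


print(old_site_url(56))
print(old_site_url(56, 1731))
print(old_site_url(56, 1731, tab="tabSerie"))
print(old_site_url(56, 1731, tab="tabJednostki"))
print(old_site_url(56, 1731, series=1, unit=2))
```

Everything you need to know to build the link is information you already have as a researcher: the archive, the fond, the series and the sygnatura.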

Now let’s see what these links look like on the new site.

To go to the Przemyśl archive, you go to the following link:

https://www.szukajwarchiwach.gov.pl/en/web/archiwum-panstwowe-w-przemyslu/home

So the code is gone and the name of the archive is there instead. That’s not necessarily a bad thing, but names can change. For example, archive #9 used to be Archiwum Państwowe w Elblągu z siedzibą w Malborku, and is now Archiwum Państwowe w Malborku. What would happen to the link in that case? On the old site, the link didn’t change. I don’t know whether that name change happened before or after the new site went up, but if the new site did exist when the change happened, do the old links still work?

To share a link to the same fond on szukajwarchiwach.gov.pl, you use the following link:

https://www.szukajwarchiwach.gov.pl/en/zespol/-/zespol/128030

The first problem is the /zespol/-/zespol/ which is duplicated in the URL. Yes, zespol is the Polish word for fond, but the connection to the archive is gone. The bigger problem is the number 128030 which has no clear relation to Fond 1731 from Archive 56. Maybe there is some master list of fonds that this corresponds to, but I’m certainly not aware of any such list. What’s worse is that if the number has no relationship to the content, then there is a possibility that the number could change over time. This is a serious concern because that’s exactly what still happens with the PRADZIAD database where the page links change every few months when they update the database. Is that going to happen here? That would be awful.

Okay, so let’s say you want to show the list of fond series. This is how you do it on the new site:

https://www.szukajwarchiwach.gov.pl/en/zespol?p_p_id=Zespol&p_p_lifecycle=1&p_p_state=normal&p_p_mode=view&_Zespol_javax.portlet.action=zmienWidok&_Zespol_nameofjsp=serie&_Zespol_id_zespolu=128030

If you want to show the list of sygnatura, this is what you link to:

https://www.szukajwarchiwach.gov.pl/en/zespol?p_p_id=Zespol&p_p_lifecycle=1&p_p_state=normal&p_p_mode=view&_Zespol_javax.portlet.action=zmienWidok&_Zespol_nameofjsp=jednostki&_Zespol_id_zespolu=128030
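To see what actually differs between those two long links, you can pull the query strings apart. This is a small Python sketch using only the standard library; the helper names are mine, and it only inspects the two URLs quoted above without contacting the site.

```python
from urllib.parse import parse_qs, urlsplit

SERIES_URL = (
    "https://www.szukajwarchiwach.gov.pl/en/zespol?p_p_id=Zespol&p_p_lifecycle=1"
    "&p_p_state=normal&p_p_mode=view&_Zespol_javax.portlet.action=zmienWidok"
    "&_Zespol_nameofjsp=serie&_Zespol_id_zespolu=128030"
)
UNITS_URL = (
    "https://www.szukajwarchiwach.gov.pl/en/zespol?p_p_id=Zespol&p_p_lifecycle=1"
    "&p_p_state=normal&p_p_mode=view&_Zespol_javax.portlet.action=zmienWidok"
    "&_Zespol_nameofjsp=jednostki&_Zespol_id_zespolu=128030"
)


def query_params(url: str) -> dict:
    """Return the query-string parameters of a URL as a flat dict."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


def differing_params(url_a: str, url_b: str) -> dict:
    """Return only the parameters whose values differ between two URLs."""
    a, b = query_params(url_a), query_params(url_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


print(differing_params(SERIES_URL, UNITS_URL))
print(query_params(SERIES_URL)["_Zespol_id_zespolu"])
```

Of the seven parameters, only _Zespol_nameofjsp changes (serie versus jednostki), and _Zespol_id_zespolu carries the same 128030 as the short fond link. The rest is the same in both links.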

I hope you’ll agree that these are not quite as elegant as the old site. That’s not it, however. If you want to link to the same sygnatura as above (the second birth sygnatura), the URL is:

https://www.szukajwarchiwach.gov.pl/en/jednostka/-/jednostka/18219558

Okay, so we actually see a pattern similar to the original link to the fond. That’s a good sign. Like the link to the fond (zespol in Polish) this link duplicates the Polish word jednostka (/jednostka/-/jednostka/) and gives us yet another number that we neither know the origin of, nor know if it will remain permanent. I don’t know why there’s a need to duplicate the word, or add the /-/ in the middle, but worse is that to get the list of series or sygnatura you need to use those huge URLs, and that we don’t know what these numbers refer to at all.
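The shared pattern in the two short links can be written down as a regular expression. The Python sketch below is based only on the two example links in this post, so the pattern is my guess at the scheme and not something the site documents. The function name is hypothetical.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Guessed pattern: /<language>/<kind>/-/<kind>/<numeric id>
# The (?P=kind) backreference requires the repeated word to match the first one.
_SHORT_LINK = re.compile(r"^/(?P<lang>[a-z]{2})/(?P<kind>[a-z]+)/-/(?P=kind)/(?P<id>\d+)$")


def parse_new_site_link(url: str) -> Optional[Tuple[str, int]]:
    """Return (kind, id) for a short szukajwarchiwach.gov.pl link, or None if it doesn't fit."""
    match = _SHORT_LINK.match(urlsplit(url).path)
    if match is None:
        return None
    return match.group("kind"), int(match.group("id"))


print(parse_new_site_link("https://www.szukajwarchiwach.gov.pl/en/zespol/-/zespol/128030"))
print(parse_new_site_link("https://www.szukajwarchiwach.gov.pl/en/jednostka/-/jednostka/18219558"))
print(parse_new_site_link("https://www.szukajwarchiwach.gov.pl/en/web/archiwum-panstwowe-w-przemyslu/home"))
```

All you get out of a link is the kind of object (zespol or jednostka) and an opaque number. Nothing in it tells you the archive, fond, series or sygnatura number.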

One other positive sign is that the numbers of the fonds and sygnatura at least seem to be sequential. That might indicate they are permanent, and hopefully that there is some master list of numbers that match up to fond and sygnatura in each archive. At least then the URLs wouldn’t change. Of course, if they’re sequential now, and new sygnatura are added to a fond, then they either won’t remain completely sequential, or they will change. Keep in mind that new sygnatura are added to fonds all the time, as records over a hundred years old move from their local civil registration office (urząd stanu cywilnego) to a regional archive.

Ironically, one improvement in the new site is the naming system for scan downloads. In the old site the name of the file was the number of the image in the sygnatura, while on the new site, the file name includes the number of the archive, fond, series and sygnatura, in addition to the image number. That means that you know exactly where that image originated once you download it. It’s ironic since the site doesn’t use these same codes for navigating to the information, but it does add it to the image names. Another related and welcome improvement is that you can download multiple scans at the same time.

If you want to take a look at how to navigate the new site, the blog From Shepherds and Shoemakers has a good overview titled Szukajwarchiwach Version 2.0: Better Than the Original! (although as you can see from the above I don’t agree completely with the title).

My suggestions to those working on the new site are the following:

  1. Add the selection menus available on the old site, and allow one to search using them without first doing a full-text search.
  2. Allow search results to be sorted better (such as alphabetically by town, archive, and then by sygnatura number).
  3. Allow search results to be formatted in a simple table with one piece of information per column and row. Even better, allow the user to choose which columns to show. Allow the table to be sorted by any column.
  4. Allow the user to export their search results as a CSV (UTF-8 please).
  5. Make the URLs cleaner and more consistent, like the old site. Never use numbers in the URL that have no reference and may change. If these numbers are permanent, then please publish a list of what each of these numbers corresponds to.
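To make suggestions 2 through 4 a bit more concrete, here is a rough Python sketch of the kind of output I have in mind: one piece of information per column, sorted by town, archive and sygnatura, and exported as UTF-8 CSV. The column names and the sample rows are purely illustrative (I reuse the Kańczuga fond numbers from above) and do not reflect how either site stores its data.

```python
import csv
import io

# Hypothetical column names, one piece of information per column.
COLUMNS = ["town", "archive", "fond", "series", "unit"]

# Illustrative rows only; not real search output.
rows = [
    {"town": "Kańczuga", "archive": 56, "fond": 1731, "series": 1, "unit": 2},
    {"town": "Kańczuga", "archive": 56, "fond": 1731, "series": 1, "unit": 1},
]


def sort_rows(results):
    """Sort by town, then archive, then sygnatura number.

    Note: this is a plain string sort on the town name, not Polish
    alphabetical order, which would need locale-aware collation.
    """
    return sorted(results, key=lambda r: (r["town"], r["archive"], r["unit"]))


def to_csv_bytes(results, columns=COLUMNS):
    """Serialize the chosen columns to CSV and encode the text as UTF-8."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue().encode("utf-8")


data = to_csv_bytes(sort_rows(rows))
print(data.decode("utf-8"))
```

Passing a shorter list as columns gives the "choose which columns to show" behavior, and encoding as UTF-8 keeps Polish letters such as ń intact.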

1 thought on “Figuring out the Polish State Archive changes”

  1. Thank you for comparing the sites. I, too, have experienced different search results at the sites. It’s not easy for any particular town to find the BMDs at the new site.
    Is there any easy way of doing so?
    Example, search for Brzeziny for the various denominations.
