O Say Can You See Data Posting
OSCYS (or Early Washington DC) is a unique site because not only is there a map component, but a large number of relationships are also stored as RDF data. To update this site, you will need to run a specific script to regenerate these relationships.
Behind the scenes, it is good to know that OSCYS was an early Rails site, and as such does NOT rely on the CDRH’s API, but rather on a proto-API in Solr. Additionally, the interaction between documents, cases, and people in OSCYS is pretty complex and so we use Ruby to generate the Solr files rather than relying on XSLT.
Unlike other projects, the maps will be automatically generated if you run any HTML generation. However, if you would like to update ONLY the maps, you may simply specify -f csv.
Update Development
Log in and pull to development
Step 1: Log in
SSH into the development server and navigate to the data repository. [More info: if this is your first time logging in, you will need to set up server access.]
ssh username@cdrhdev1.unl.edu
cd /var/local/www/data/collections/oscys
Step 2: Check for changes
Check to see which branch is currently checked out and see if there are any unexpected changes.
git status
Typically, you should be on the dev or main branch. [More info: What to do if it is on a different branch.]
If you're on your desired branch, but there are outstanding changes, you will need to deal with them before you pull. [More info: There are files changed on the server.]
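The two checks above can be combined into a quick pre-pull test. This is only a sketch: the function names are made up for this example and are not part of the CDRH tooling, and it assumes that dev and main are the only branches you expect to see. Note that `git status --porcelain` also lists untracked files, so those count as outstanding changes here.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the CDRH tooling) for the pre-pull checks.

# Succeeds if the branch name is one of the branches this guide expects.
is_expected_branch() {
  case "$1" in
    dev|main) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the `git status --porcelain` output passed as $1 is empty,
# meaning there are no outstanding changes.
is_clean() {
  [ -z "$1" ]
}

# Only run the checks when we are actually inside a Git working tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  branch=$(git symbolic-ref --short HEAD 2>/dev/null || echo "detached")
  changes=$(git status --porcelain)
  if is_expected_branch "$branch" && is_clean "$changes"; then
    echo "OK to pull: on $branch with no outstanding changes"
  else
    echo "Stop: on $branch; outstanding changes:"
    echo "$changes"
  fi
fi
```

If the script prints "Stop", deal with the branch or the listed files first, as described in the More info links above.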
Step 3: Pull!
Once you are on the correct branch, there are no outstanding changes, and everything is copacetic, then you can pull updates.
git pull origin [current branch name]
You should either get a message saying "Already up to date" or you should see a bunch of new files show up with no errors. [More info: What if there is a problem when I git pull?]
Step 4: Upload images and other media
If your project uses images, audio, or video, you will need to upload any files that are connected with your new documents! TODO figure out some instructions knowing that most people use SFTP clients and that this isn't always on the same server oh no.
You will need to put your files in the following location on cors1601.unl.edu:
/var/local/www/media/oscys/[relevant directory]
If you REPLACE any images and you do not see the new version show up on your site, this is because the IIIF image server caches images and you will need to either wait (a few days) or ask a CDRH dev to purge the cache.
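For the upload itself, most people use an SFTP client. If you prefer the command line, the sketch below prints (but does not run) an `scp` command for the location given above. It assumes that `scp` is available to you and that your account can write to the media directory, neither of which is confirmed here. The helper name, file name, and subdirectory are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) an scp command for one file.
# $1 = local file, $2 = your username,
# $3 = the relevant directory under /var/local/www/media/oscys
media_upload_cmd() {
  printf 'scp %s %s@cors1601.unl.edu:/var/local/www/media/oscys/%s/\n' "$1" "$2" "$3"
}

# Placeholder values: substitute your own file, username, and directory.
media_upload_cmd "example.jpg" "username" "some_directory"
```

Review the printed command, then copy and run it once the file name and directory are correct.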
Now you should be ready to begin updating the development website's contents!
Update the development site
Step 0. Generate HTML
Yes, step 0, because most sites with Solr don't use pre-generated HTML so consider your site special enough to have its own unique step!
Let's generate the new pages for your site before we update the search, and that way they will already be in place and ready to be discovered.
post -e development -x html
If you are only adding a few new files, you may want to regenerate only those which were updated when you did the git pull earlier:
post -e development -x html -u today
This script may take some time to run. [More info: troubleshoot if there are errors with the HTML generation.]
Step 1. Clear Solr (optional)
If you have removed any files or changed identifiers, you will need to clear the old, no longer used file from the Solr index. If not, skip to Step 2.
You may clear either one specific file or ALL the files. Keep in mind that if you clear all the files, the site will have no content until you repopulate the search.
solr_clear_index -e development
You can also clear a specific file if you only need to drop one item from the index, but know that if you use an id like `10` you may be clearing more items than you intend, so be specific if possible!
# clear one file from the index with -r [id]
# (do not include extension)
# (be specific with id to avoid accidentally removing more files)
solr_clear_index -e development -r wwa\.0001
[More info: other options for clearing Solr by subcategory, etc]
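The escaped dot in `wwa\.0001` suggests that the `-r` value is treated as a regular expression. That is an assumption, not something this guide confirms. Under that assumption, the sketch below uses `grep -E` on a made-up list of ids to show why a short id such as `10` can match more than you intend. The ids are invented for illustration and are not real identifiers.

```shell
#!/bin/sh
# Made-up ids for illustration only; these are not real identifiers.
ids='doc.0010
doc.0100
doc.1000
doc.0002'

# Print the ids that a given pattern would match, one per line.
matching_ids() {
  printf '%s\n' "$ids" | grep -E -- "$1"
}

printf '%s\n' 'Pattern 10 matches:'
matching_ids '10'

printf '%s\n' 'Pattern doc\.0010 matches:'
matching_ids 'doc\.0010'
```

The short pattern matches three of the four ids, while the full id with an escaped dot matches only one.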
Step 2. Populate the search!
This is the part where you get to add your new and updated files into your site's search!
post -e development -x solr
This may take some time to run, so if you are impatient, you may want to post only specific files, unless you cleared the entire index in the last step. [More info: learn about posting by file type, date, and file name.]
Step 3: Update the relationships
If you have not updated the `rdf/oscys.relationships.csv`, you may skip this step. However, if you have downloaded a new copy of the Google Sheets document with person relationships, then add it as `rdf/oscys.relationships.csv`. If you do this locally and push, you can use `git pull` to move the new copy to the server.
From the root directory of the data repo, run the following command:
ruby scripts/csv_to_rdf.rb
This may take some time to run. Once it is done, it will have updated the file `rdf/oscys.relationships.ttl`. In a few steps when you are adding things to Git, you will want to make sure that you also add this file if you are happy with how everything looks.
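Steps 0, 2, and 3 can be chained so that a failure in one stops the rest. The wrapper below is a hypothetical sketch, not part of the CDRH tooling: `run`, `update_site`, and `DRY_RUN` are names made up for this example. It only prints the commands by default, and it leaves out the optional Solr clearing step.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the CDRH tooling) that prints or runs
# the update commands from this guide in order.
# By default it only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@" || { echo "Failed: $*" >&2; return 1; }
  fi
}

# $1 = environment name passed to post -e (development or production)
update_site() {
  run post -e "$1" -x html &&
  run post -e "$1" -x solr &&
  run ruby scripts/csv_to_rdf.rb
}

update_site development
```

Run it from the root of the data repo. If you have not updated the relationships CSV, the last command is unnecessary and can be removed.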
Check the site
Go to https://cdrhdev1.unl.edu/earlywashingtondc/. Your changes should already be showing up there, although you may need to hard refresh to see them. [More info: what is a hard refresh?]
Make sure that you check not only the search results but an individual item page to make sure everything is shipshape.
Push changes (if needed)
Part 1: Check changes
After you've run your script, check if there are any files that have changed which you should commit.
git status
In general, files generated for the development environment will not need to be committed, so do not be surprised if there isn't anything here!
Part 2: Commit and push
If there are expected changes, such as to output/development/html, or for project-specific features, you will need to commit them.
# you may run git add repeatedly to add different directories and files
git add [file path] [or directory path]
# check it over to make sure you're adding everything you need
git status
git commit -m "message about your changes"
git push origin [current branch name]
Update Production
Log in and pull to production
Step 1: Log in
SSH into the production server and navigate to the data repository. [More info: if this is your first time logging in, you will need to set up server access.]
ssh username@cors1601.unl.edu
cd /var/local/www/data/collections/oscys
Step 2: Check for changes
Check to see which branch is currently checked out and see if there are any unexpected changes.
git status
Typically, you should be on the dev or main branch. [More info: What to do if it is on a different branch.]
If you're on your desired branch, but there are outstanding changes, you will need to deal with them before you pull. [More info: There are files changed on the server.]
Step 3: Pull!
Once you are on the correct branch, there are no outstanding changes, and everything is copacetic, then you can pull updates.
git pull origin [current branch name]
You should either get a message saying "Already up to date" or you should see a bunch of new files show up with no errors. [More info: What if there is a problem when I git pull?]
Step 4: Upload images and other media
This section is repeated for the sake of being thorough, but if you already have added your media files either during the development step or previously, you may skip this!
If your project uses images, audio, or video, you will need to upload any files that are connected with your new documents! TODO figure out some instructions knowing that most people use SFTP clients and that this isn't always on the same server oh no.
You will need to put your files in the following location on cors1601.unl.edu:
/var/local/www/media/oscys/[relevant directory]
If you REPLACE any images and you do not see the new version show up on your site, this is because the IIIF image server caches images and you will need to either wait (a few days) or ask a CDRH dev to purge the cache.
Now you should be ready to begin updating the production website's contents!
Update the production site
Step 0. Generate HTML
Yes, step 0, because most sites with Solr don't use pre-generated HTML so consider your site special enough to have its own unique step!
Let's generate the new pages for your site before we update the search, and that way they will already be in place and ready to be discovered.
post -e production -x html
If you are only adding a few new files, you may want to regenerate only those which were updated when you did the git pull earlier:
post -e production -x html -u today
This script may take some time to run. [More info: troubleshoot if there are errors with the HTML generation.]
Step 1. Clear Solr (optional)
If you have removed any files or changed identifiers, you will need to clear the old, no longer used file from the Solr index. If not, skip to Step 2.
You may clear either one specific file or ALL the files. Keep in mind that if you clear all the files, the site will have no content until you repopulate the search.
solr_clear_index -e production
You can also clear a specific file if you only need to drop one item from the index, but know that if you use an id like `10` you may be clearing more items than you intend, so be specific if possible!
# clear one file from the index with -r [id]
# (do not include extension)
# (be specific with id to avoid accidentally removing more files)
solr_clear_index -e production -r wwa\.0001
[More info: other options for clearing Solr by subcategory, etc]
Step 2. Populate the search!
This is the part where you get to add your new and updated files into your site's search!
post -e production -x solr
This may take some time to run, so if you are impatient, you may want to post only specific files, unless you cleared the entire index in the last step. [More info: learn about posting by file type, date, and file name.]
Step 3: Update the relationships
If you have not updated the `rdf/oscys.relationships.csv`, you may skip this step. However, if you have downloaded a new copy of the Google Sheets document with person relationships, then add it as `rdf/oscys.relationships.csv`. If you do this locally and push, you can use `git pull` to move the new copy to the server.
From the root directory of the data repo, run the following command:
ruby scripts/csv_to_rdf.rb
This may take some time to run. Once it is done, it will have updated the file `rdf/oscys.relationships.ttl`. In a few steps when you are adding things to Git, you will want to make sure that you also add this file if you are happy with how everything looks.
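To confirm that the script actually changed `rdf/oscys.relationships.ttl` before you commit, you can look for it in the output of `git status --porcelain`. The helper below is a hypothetical sketch (the name `ttl_changed` is made up for this example). It reads the status output on standard input and only queries Git when run inside a working tree.

```shell
#!/bin/sh
# Hypothetical helper: reads `git status --porcelain` output on stdin and
# succeeds if the regenerated relationships file is listed as changed.
ttl_changed() {
  grep -q 'rdf/oscys\.relationships\.ttl$'
}

# Only query Git when run inside a working tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  if git status --porcelain | ttl_changed; then
    echo "rdf/oscys.relationships.ttl has changes to review and add"
  else
    echo "rdf/oscys.relationships.ttl is unchanged"
  fi
fi
```

If the file is reported as unchanged after you replaced the CSV, the new relationships may be identical to the old ones, or the script may not have finished successfully.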
Check the site
Go to https://earlywashingtondc.org. Your changes should already be showing up there, although you may need to hard refresh to see them. [More info: what is a hard refresh?]
Make sure that you check not only the search results but an individual item page to make sure everything is shipshape.
Don't close this page yet! You're not quite done!
Push changes (if needed)
Part 1: Check changes
After you've run your script, check if there are any files that have changed which you should commit.
git status
Part 2: Commit and push
If there are expected changes, such as to output/production/html, or for project-specific features, you will need to commit them.
# you may run git add repeatedly to add different directories and files
git add [file path] [or directory path]
# check it over to make sure you're adding everything you need
git status
git commit -m "message about your changes"
git push origin [current branch name]
More Information
Login and Pull
How to set up SSH
TODO
What to do if the git branch is not dev or main
TODO
There are files changed on the server
TODO
What if there is a problem when I git pull?
TODO cover permissions, merge conflict
Update the site
Clearing Solr by subcategory and more
TODO
Clearing Elasticsearch by subcategory and more
TODO possibly combine with above
Posting to Solr by specific file type, name, and updated date
TODO
HTML generation errors
TODO
Elasticsearch population errors
TODO
Check the site
Cocoon is broken
TODO
How to hard refresh
TODO