Family Letters Data Posting
Family Letters, or Cartas a la Familia, is unusual in that the data repository populates the API and THEN requests information back from the API to build GeoJSON. Before posting, check that the following settings are filled out in `config/public.yml` or `config/private.yml`.
- api_request: true
- api_endpoint: https://cdrhapi.unl.edu/v1/collection/family_letters/item
If you choose to disable the API requests or skip this step, the map will not be updated even if letter or photograph locations have been altered or added.
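One quick way to confirm both settings before posting is to search the config files for them. The sketch below assumes each setting sits on its own line in `key: value` form and that the config directory lives at the root of the data repository. `check_api_config` is a hypothetical helper name, not part of the posting scripts.

```shell
# Hypothetical helper: report whether each API setting appears in the config files.
# $1 = path to the data repository root (defaults to the current directory)
check_api_config() {
  repo="${1:-.}"
  for key in api_request api_endpoint; do
    # Read both files together; a missing private.yml is silently skipped
    if cat "$repo/config/public.yml" "$repo/config/private.yml" 2>/dev/null \
        | grep -q "^[[:space:]]*${key}:"; then
      echo "found: $key"
    else
      echo "MISSING: $key"
    fi
  done
}
```

Run `check_api_config` from the repository root. It only checks that the keys are present, so the values themselves still need a look by eye.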
Update Development
Log in and pull to development
Step 1: Log in
SSH into the development server and navigate to the data repository. [More info: if this is your first time logging in, you will need to set up server access.]
ssh username@cdrhdev1.unl.edu
cd /var/local/www/data/collections/family_letters
Step 2: Check for changes
Check to see which branch is currently checked out and see if there are any unexpected changes.
git status
Typically, you should be on the dev or main branch. [More info: What to do if it is on a different branch.]
If you're on your desired branch, but there are outstanding changes, you will need to deal with them before you pull. [More info: There are files changed on the server.]
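The two checks in this step (expected branch, no outstanding changes) can also be scripted. The following is a minimal sketch using standard git commands. `ready_to_pull` is a hypothetical helper name, and treating only dev and main as acceptable branches is an assumption taken from the guidance above.

```shell
# Hypothetical helper: succeed only when on dev or main with a clean working tree.
ready_to_pull() {
  # Name of the checked-out branch; empty if HEAD is detached
  branch=$(git symbolic-ref --short HEAD 2>/dev/null)
  case "$branch" in
    dev|main) ;;
    *) echo "unexpected branch: ${branch:-detached HEAD}"; return 1 ;;
  esac
  # --porcelain prints one line per changed or untracked file, nothing if clean
  if [ -n "$(git status --porcelain)" ]; then
    echo "outstanding changes on $branch"
    return 1
  fi
  echo "ok to pull: $branch"
}
```

Because the function returns a non-zero status when something looks off, it can guard the next step, for example `ready_to_pull && git pull origin "$branch"`.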
Step 3: Pull!
Once you are on the correct branch, there are no outstanding changes, and everything is copacetic, then you can pull updates.
git pull origin [current branch name]
You should either get a message saying "Already up to date" or you should see a bunch of new files show up with no errors. [More info: What if there is a problem when I git pull?]
Step 4: Upload images and other media
If your project uses images, audio, or video, you will need to upload any files that are connected with your new documents! TODO figure out some instructions knowing that most people use SFTP clients and that this isn't always on the same server oh no.
You will need to put your files in the following location on cors1601.unl.edu:
/var/local/www/media/family_letters/[relevant directory]
If you REPLACE any images and you do not see the new version show up on your site, this is because the IIIF image server caches images. You will need to either wait (a few days) or ask a CDRH dev to purge the cache.
Now you should be ready to begin updating the development website's contents!
Update the development site
Step 1. Generate HTML
Let's generate the new pages for your site before we update the search, and that way they will already be in place and ready to be discovered!
post -e development
If you are only adding a few new files, you may want to regenerate only those which were updated when you did the git pull earlier:
post -e development -u today
If your data includes TEI-XML files, this script will probably take a while to run since it will be transforming the files with XSLT. Other formats, like CSV, should be real snappy, though! [More info: troubleshoot if there are errors with the HTML generation.]
Step 2. Clear search (optional)
If you have removed any files or changed identifiers, you will need to clear the old, no longer used file from the Elasticsearch index. If not, skip to Step 3.
You may clear either one specific file or ALL the files. Keep in mind that if you clear all the files, the site will have no content until you repopulate the search.
es_clear_index -e development
You can also clear a specific file if you only need to drop one item from the index, but know that if you use an id like `10` you may be clearing more items than you intend, so be specific if possible!
# clear one file from the index with -r [id]
# (do not include extension)
# (be specific with id to avoid accidentally removing more files)
es_clear_index -e development -r wwa\.0001
[More info: other options for clearing Elasticsearch by subcategory, etc]
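The escaped dot in `wwa\.0001` suggests that `-r` is treated as a regular expression, which would explain why a short id can remove more than intended. Under that assumption (not confirmed here), the effect can be previewed with `grep -E` against a list of ids. The ids and the `count_matches` helper below are made up for illustration.

```shell
# Hypothetical ids, one per line, for illustration only
ids='wwa.0001
wwa.0010
wwa.0100
wwaX0001'

# Count how many of the ids an extended regular expression matches
count_matches() {
  printf '%s\n' "$ids" | grep -c -E -- "$1"
}

count_matches '10'          # loose: matches wwa.0010 and wwa.0100
count_matches 'wwa.0001'    # unescaped dot also matches wwaX0001
count_matches 'wwa\.0001'   # escaped dot matches only wwa.0001
```

Previewing a pattern this way against your own list of ids is a cheap safeguard before removing anything from the index.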
Step 3. Populate the search!
It's pretty straightforward to get the rest of your files into Elasticsearch, fortunately!
post -e development
Keep an eye out for errors as it zips through your files. [More info: troubleshooting if there is a problem.]
NOTE: Your project uses a process called webscraping for at least some of your search results. This means that a script goes to pages of your website and grabs the text to use in search results. You may wish to check in `config/public.yml` and possibly `config/private.yml` to make sure that webscraping is enabled, or else these pages will not be updated.
Step 4: Update the map
You will need to run one final thing to update the GeoJSON which is powering your map.
post -f api_location -x html
This will recreate source/api_location/spatial.json and use it to create a GeoJSON file in the output directory.
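To confirm that the map data was actually regenerated, it may help to check that source/api_location/spatial.json is non-empty and was written in the last few minutes. This is a rough sketch: `recently_written` is a hypothetical helper, the 10-minute default is arbitrary, and `find -mmin` is assumed to be available (it is in GNU and BSD find, but it is not part of POSIX).

```shell
# Hypothetical helper: succeed if the file exists, is non-empty,
# and was modified within the last N minutes.
# $1 = file path, $2 = minutes (defaults to 10)
recently_written() {
  [ -s "$1" ] && [ -n "$(find "$1" -mmin "-${2:-10}")" ]
}
```

For example, run `recently_written source/api_location/spatial.json && echo "spatial.json was refreshed"` from the repository root. A stale or missing file may point back to the API settings in the config files.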
Check the site
Go to https://cdrhdev1.unl.edu/family_letters. Your changes should already be showing up there, although you may need to hard refresh to see them. [More info: what is a hard refresh?]
Make sure that you check not only the search results but an individual item page to make sure everything is shipshape.
Push changes (if needed)
Part 1: Check changes
After you've run your script, check if there are any files that have changed which you should commit.
git status
In general, files generated for the development environment will not need to be committed, so do not be surprised if there isn't anything here!
Part 2: Commit and push
If there are expected changes, such as to output/development/html, or for project-specific features, you will need to commit them.
# you may run git add repeatedly to add different directories and files
git add [file path] [or directory path]
# check it over to make sure you're adding everything you need
git status
git commit -m "message about your changes"
git push origin [current branch name]
Update Production
Log in and pull to production
Step 1: Log in
SSH into the production server and navigate to the data repository. [More info: if this is your first time logging in, you will need to set up server access.]
ssh username@cors1601.unl.edu
cd /var/local/www/data/collections/family_letters
Step 2: Check for changes
Check to see which branch is currently checked out and see if there are any unexpected changes.
git status
Typically, you should be on the dev or main branch. [More info: What to do if it is on a different branch.]
If you're on your desired branch, but there are outstanding changes, you will need to deal with them before you pull. [More info: There are files changed on the server.]
Step 3: Pull!
Once you are on the correct branch, there are no outstanding changes, and everything is copacetic, then you can pull updates.
git pull origin [current branch name]
You should either get a message saying "Already up to date" or you should see a bunch of new files show up with no errors. [More info: What if there is a problem when I git pull?]
Step 4: Upload images and other media
This section is repeated for the sake of being thorough, but if you already have added your media files either during the development step or previously, you may skip this!
If your project uses images, audio, or video, you will need to upload any files that are connected with your new documents! TODO figure out some instructions knowing that most people use SFTP clients and that this isn't always on the same server oh no.
You will need to put your files in the following location on cors1601.unl.edu:
/var/local/www/media/family_letters/[relevant directory]
If you REPLACE any images and you do not see the new version show up on your site, this is because the IIIF image server caches images. You will need to either wait (a few days) or ask a CDRH dev to purge the cache.
Now you should be ready to begin updating the production website's contents!
Update the production site
Step 1. Generate HTML
Let's generate the new pages for your site before we update the search, and that way they will already be in place and ready to be discovered!
post -e production
If you are only adding a few new files, you may want to regenerate only those which were updated when you did the git pull earlier:
post -e production -u today
If your data includes TEI-XML files, this script will probably take a while to run since it will be transforming the files with XSLT. Other formats, like CSV, should be real snappy, though! [More info: troubleshoot if there are errors with the HTML generation.]
Step 2. Clear search (optional)
If you have removed any files or changed identifiers, you will need to clear the old, no longer used file from the Elasticsearch index. If not, skip to Step 3.
You may clear either one specific file or ALL the files. Keep in mind that if you clear all the files, the site will have no content until you repopulate the search.
es_clear_index -e production
You can also clear a specific file if you only need to drop one item from the index, but know that if you use an id like `10` you may be clearing more items than you intend, so be specific if possible!
# clear one file from the index with -r [id]
# (do not include extension)
# (be specific with id to avoid accidentally removing more files)
es_clear_index -e production -r wwa\.0001
[More info: other options for clearing Elasticsearch by subcategory, etc]
Step 3. Populate the search!
It's pretty straightforward to get the rest of your files into Elasticsearch, fortunately!
post -e production
Keep an eye out for errors as it zips through your files. [More info: troubleshooting if there is a problem.]
NOTE: Your project uses a process called webscraping for at least some of your search results. This means that a script goes to pages of your website and grabs the text to use in search results. You may wish to check in `config/public.yml` and possibly `config/private.yml` to make sure that webscraping is enabled, or else these pages will not be updated.
Step 4: Update the map
You will need to run one final thing to update the GeoJSON which is powering your map.
post -f api_location -x html
This will recreate source/api_location/spatial.json and use it to create a GeoJSON file in the output directory.
Check the site
Go to https://familyletters.unl.edu. Your changes should already be showing up there, although you may need to hard refresh to see them. [More info: what is a hard refresh?]
Make sure that you check not only the search results but an individual item page to make sure everything is shipshape.
Don't close this page yet! You're not quite done!
Push changes (if needed)
Part 1: Check changes
After you've run your script, check if there are any files that have changed which you should commit.
git status
Part 2: Commit and push
If there are expected changes, such as to output/production/html, or for project-specific features, you will need to commit them.
# you may run git add repeatedly to add different directories and files
git add [file path] [or directory path]
# check it over to make sure you're adding everything you need
git status
git commit -m "message about your changes"
git push origin [current branch name]
More Information
Login and Pull
How to set up SSH
TODO
What to do if the git branch is not dev or main
TODO
There are files changed on the server
TODO
What if there is a problem when I git pull?
TODO cover permissions, merge conflict
Update the site
Clearing Solr by subcategory and more
TODO
Clearing Elasticsearch by subcategory and more
TODO possibly combine with above
Posting to Solr by specific file type, name, and updated date
TODO
HTML generation errors
TODO
Elasticsearch population errors
TODO
Check the site
Cocoon is broken
TODO
How to hard refresh
TODO