Basic Web Automation for Collecting Activity
Some people have asked for this. It had been a while since I knew where the code was stored, but I stumbled on it the other day and figured I would share. People are often curious how practical screen scraping is done. This is a very crufty example, but it works and is easy to automate (throw it into cron and you are done). No special tools needed:
#!/usr/bin/bash
#
# Version 1.0
# 11 January 2014
#
# Abstract: To automate the collection of Spiceworks activity data at the
# current time through the direct querying of the SW Community

## Initialize temp file
echo "" > /tmp/swreport

## Generate Data
for i in $(cat /opt/scripts/staff); do
  echo $i $(curl -b /opt/scripts/cookies.txt http://community.spiceworks.com/people/$i/activity 2>/dev/null| grep Points \
  | cut -d">" -f 3 | cut -d "<" -f1) $(curl -b /opt/scripts/cookies.txt http://community.spiceworks.com/people/$i/activity \
  2>/dev/null| grep Answer | cut -d">" -f 3 | cut -d "<" -f1) $(curl -b /opt/scripts/cookies.txt \
  http://community.spiceworks.com/people/$i/activity 2>/dev/null| grep Posts| cut -d">" -f 3 | cut -d "<" -f1) \
  $(curl -b /opt/scripts/cookies.txt http://community.spiceworks.com/people/$i/activity 2>/dev/null | grep '"title"' | \
  cut -d"<" -f2 | cut -d">" -f2) | sed 's/,//g' >> /tmp/swreport
done

## Format Report
echo "Screenname Points BAs HPs Pepper" > /tmp/swreport.sorted
echo " " >> /tmp/swreport.sorted
sort -k2nr /tmp/swreport >> /tmp/swreport.sorted
column -c 4 -t -s $' ' /tmp/swreport.sorted > /tmp/swreport.col
echo "This daily report of Spiceworks standings is generated automatically by the swreport.sh script on to-lnx-dev. \
This report is created by directly querying the Spiceworks Community at the time of creation and is completely \
up to date at creation time." > /tmp/swreport.for
echo "" >> /tmp/swreport.for
cat /tmp/swreport.col >> /tmp/swreport.for

## Send Out Report
mail -s "Spiceworks Daily Report - Straight from the Server" [email protected] < /tmp/swreport.for
The script is quick and dirty, with hard-coded locations. It requires the text file /opt/scripts/staff to contain the list of user names to query, one per line. Add as many names to the list as you want in the report.
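One thing worth noting about the script above is that it downloads the same activity page four times per user, once for each field. A rough sketch of fetching the page once and parsing the saved copy might look like the following. The function names (extract_field, extract_title, report_line, collect) and the sample HTML are made up for illustration; the real page markup is not shown here and may well differ, so treat this as a starting point rather than a drop-in replacement.

```shell
#!/usr/bin/env bash
# Sketch only: same grep/cut pipelines as the original script, but each
# activity page is fetched once and parsed four times from memory.

# extract_field LABEL: read HTML on stdin; on lines matching LABEL, print the
# text between the second '>' and the next '<' (as done for Points/Answer/Posts).
extract_field() {
  grep "$1" | cut -d'>' -f3 | cut -d'<' -f1
}

# extract_title: same idea for the '"title"' line, which the original script
# cuts in the opposite order ('<' first, then '>').
extract_title() {
  grep '"title"' | cut -d'<' -f2 | cut -d'>' -f2
}

# report_line NAME: read one page on stdin and print
# "NAME points answers posts title" with commas stripped.
# Assumes each label matches exactly one line of the page.
report_line() {
  local name=$1 page
  page=$(cat)
  echo "$name" \
    "$(printf '%s\n' "$page" | extract_field Points)" \
    "$(printf '%s\n' "$page" | extract_field Answer)" \
    "$(printf '%s\n' "$page" | extract_field Posts)" \
    "$(printf '%s\n' "$page" | extract_title)" | sed 's/,//g'
}

# collect: loop over the staff file, one curl call per user.
# Defined here but not called, so running this sketch makes no network requests.
collect() {
  local name
  while read -r name; do
    [ -n "$name" ] || continue
    curl -s -b /opt/scripts/cookies.txt \
      "http://community.spiceworks.com/people/$name/activity" | report_line "$name"
  done < /opt/scripts/staff
}

# Offline demo against a HYPOTHETICAL page fragment (not real Spiceworks markup).
sample='<li><span>1,234</span> Points</li>
<li><span>56</span> Best Answers</li>
<li><span>789</span> Helpful Posts</li>
<span class="title">Jalapeno</span>'

printf '%s\n' "$sample" | report_line alice
# prints: alice 1234 56 789 Jalapeno
```

Calling collect and redirecting its output to /tmp/swreport would stand in for the "Generate Data" loop; the formatting and mail steps would stay as they are.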
Before this script will run properly, you also have to log in with curl using a valid account and save the resulting cookie to /opt/scripts/cookies.txt, which the script passes along with every request (curl -b).
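For the cookie step, curl can write the cookies it receives to a file with its -c option, and that file can then be read back with -b as the script does. Something along these lines might work, but note the assumptions: the login URL and the form field names (email, password) below are guesses and not taken from the site, so check the actual login form first. The save_cookie function and the CURL override are likewise just illustrative names.

```shell
#!/usr/bin/env bash
# Sketch only: save a session cookie with curl -c so the report script can
# send it back later with curl -b.

# ASSUMPTION: hypothetical login endpoint. Replace with the real form action.
LOGIN_URL=${LOGIN_URL:-http://community.spiceworks.com/login}
# Same cookie file the report script reads.
COOKIE_JAR=${COOKIE_JAR:-/opt/scripts/cookies.txt}
# Set CURL=echo for a dry run that only prints the arguments.
CURL=${CURL:-curl}

# save_cookie EMAIL PASSWORD
# ASSUMPTION: the form fields are named "email" and "password".
save_cookie() {
  "$CURL" -s -o /dev/null -c "$COOKIE_JAR" \
    --data-urlencode "email=$1" \
    --data-urlencode "password=$2" \
    "$LOGIN_URL"
}

# Example (not run here):
#   save_cookie 'you@example.com' 'your-password'
```

If the site's login form uses hidden tokens or redirects, more steps will be needed than this; the only part the report script depends on is that a valid cookie file ends up at /opt/scripts/cookies.txt.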