Weekly DC Stats scripts

Skillz
Site Admin
Reactions:
Posts: 1925
Joined: Sun Sep 15, 2019 3:03 pm

Re: Weekly DC Stats scripts

Post by Skillz »

Does the script have to be run multiple times for a reason? Like are there instances where the script fails to do what it's supposed to do so running it every minute for an hour guarantees success at least once?
crashtech
TAAT Member
Reactions:
Posts: 1571
Joined: Sun Sep 15, 2019 4:45 pm
Location: Idaho, USA


Post by crashtech »

No, I just want it to run once. Maybe my syntax is wrong; I need to check it. I assumed that it was failing and retrying, but it sounds like I am wrong about that. I only want it to run one time a week, at 3AM on Sunday morning.

I will put the whole path of the output file as well.

Post by Skillz »

Well, the cronjob schedule should look like this:

0 3 * * sun
or
0 3 * * 7

That's at 3am every Sunday, every week. It will only run once.

Cool site I use to check the cron scheduling is here:
https://crontab.guru/

Very useful for taking the "work" out of setting up a cronjob.
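
For anyone who wants to check a schedule offline, here is a rough Python sketch of how the five fields are matched. It only understands `*` and plain integers (so `sun` is not supported), and it ignores cron's special rule for lines that restrict both day-of-month and day-of-week. The `* 3 * * 0` line is a hypothetical example of a schedule that would fire every minute for an hour; the thread does not show what the original crontab line was.

```python
from datetime import datetime, timedelta


def field_matches(field, value):
    """Match one cron field. This sketch only understands '*' and a plain integer."""
    return field == "*" or int(field) == value


def fires_at(expr, when):
    """Return True if the five-field schedule `expr` fires at the minute `when`."""
    minute, hour, dom, month, dow = expr.split()
    # cron counts Sunday as 0; 7 is treated as Sunday too, as in the post above.
    cron_dow = when.isoweekday() % 7
    dow_ok = dow == "*" or int(dow) % 7 == cron_dow
    return (field_matches(minute, when.minute)
            and field_matches(hour, when.hour)
            and field_matches(dom, when.day)
            and field_matches(month, when.month)
            and dow_ok)


def runs_in_week(expr, start):
    """List every minute within 7 days of `start` at which `expr` fires."""
    minutes = (start + timedelta(minutes=m) for m in range(7 * 24 * 60))
    return [t for t in minutes if fires_at(expr, t)]


week_start = datetime(2023, 1, 9)  # a Monday, chosen arbitrarily
print(len(runs_in_week("0 3 * * 0", week_start)))  # 1
print(len(runs_in_week("* 3 * * 0", week_start)))  # 60
```

With `0` in the minute field the job fires once, at 03:00 on Sunday. With `*` in the minute field it fires at every minute from 03:00 to 03:59.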

Post by crashtech »

Meh. Now I have something that is slightly closer to working. It runs, but it creates a blank text file instead of writing the output into it.

Code: Select all

0 3 * * 0 ./home/garage/Desktop/Weekly_Stats_Scripts/weekly_stats_dump.sh > /home/garage/Desktop/Weekly_Stats_Scripts/weekly_stats_dump_output.txt
There doesn't seem to be a problem with permissions. Running the command as a normal user from the root directory is successful.

Post by Skillz »

It might be better to write in the script to output the results to a file.

Post by crashtech »

When I run that in a terminal window, it does write the results to the file specified. So if there is a better way, I don't know how.

Post by Skillz »

I mean without having to use > to dump the output in the terminal to a file.

I'll check it out tonight and do some research on how to implement an output within the script code directly. Gotta be a way.
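
One way to do this, sketched in Python: send everything the script prints to a file from inside the script itself. The function names, the placeholder report and the demo file name are made up for illustration. In a bash script, a line like `exec > /path/to/output.txt` near the top should have much the same effect.

```python
import contextlib
import os
import tempfile


def write_report(output_path, produce_report):
    """Run `produce_report` with everything it prints sent to `output_path`."""
    with open(output_path, "w", encoding="utf-8") as out:
        with contextlib.redirect_stdout(out):
            produce_report()


def demo_report():
    # Hypothetical placeholder for the real stats output.
    print("Weekly stats placeholder")


# Demo file in the system's temp directory; a real script would use its own path.
output_file = os.path.join(tempfile.gettempdir(), "weekly_stats_dump_output_demo.txt")
write_report(output_file, demo_report)
```

Note that `redirect_stdout` only captures what Python itself prints, not the output of external programs started by the script.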
StefanR5R
TAAT Member
Reactions:
Posts: 1696
Joined: Wed Sep 25, 2019 4:32 pm


Post by StefanR5R »

Remove the dot before /home in the crontab line.
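
A possible explanation for the blank file, sketched in Python. It assumes that cron starts the job in the user's home directory (common cron implementations do) and that this home directory is /home/garage, which is only inferred from the path in the crontab line. With the leading dot, the script path is relative to the current directory, so it only resolves correctly when the command is run from the root directory.

```python
import posixpath


def resolve(cwd, command_path):
    """Resolve `command_path` the way a shell would when started in `cwd`."""
    return posixpath.normpath(posixpath.join(cwd, command_path))


command = "./home/garage/Desktop/Weekly_Stats_Scripts/weekly_stats_dump.sh"

from_root = resolve("/", command)
from_home = resolve("/home/garage", command)  # assumed home directory

print(from_root)  # /home/garage/Desktop/Weekly_Stats_Scripts/weekly_stats_dump.sh
print(from_home)  # /home/garage/home/garage/Desktop/Weekly_Stats_Scripts/weekly_stats_dump.sh
```

Under those assumptions, the shell still creates the output file because of the `>` redirection, then fails to find the script, and the file stays empty.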

Post by crashtech »

StefanR5R wrote (Mon Jan 09, 2023 3:08 pm): "Remove the dot before /home in the crontab line."
Okay I will. Initially I did not have it in there, I'll take it back out.

Post by StefanR5R »

Opening post edited:

weekly_stats_dump.sh:
– Don't look for OGR-28 updates anymore.

fah_weekly_stats_dump.sh:
– Use Firefox's user agent string in the HTTP requests which are issued via the "links" browser.
– Insert a small delay between HTTP requests, to go easy on the folding.extremeoverclocking.com web server. This script issues only 3…4 requests though, so it's not been a big deal.

Post by StefanR5R »

I edited the 1st post…
– Fixed fah_weekly_stats_dump.sh for ≥1 billion points/week/member.
…and the 2nd post:
– Updated the link to where I look up home teams of guest crunchers.
– Updated weekly_stats_dump_ukraine.sh with the current guests table.

Post by StefanR5R »

1st and 2nd post updated: added WUProp to weekly_stats_dump.sh and weekly_stats_dump_ukraine.sh

Post by crashtech »

I've been getting these stats scripts working again in the unlikely event that I might be called upon to be a substitute. But I'm unsure of my ability to check for changes in the Ukraine guest cruncher category.

Edit: Well, check and then properly implement changes to the script. It appears that the list gets updated in the OP as changes are discovered, so running the most recent version of the script would result in the fewest errors even if a change to the list is not caught. That's what I might do if I need to sub.

Post by crashtech »

I got one computer working to do the stats and now I am copying the configuration to a second computer at another location. This should add enough redundancy so that I can help in the future if needed.

Ukraine stats would at worst contain the latest guest cruncher list posted to this thread.

Post by crashtech »

Well I seem to be doing better at the backup stats keeping, but I apparently do not have the proper character set installed to correctly render the name of whoever was 7th place in Asteroids for Team Ukraine.

Post by StefanR5R »

I wouldn't worry about that. FreeDC and BoincStats and the Asteroids@home site all show different characters in this user name to me. If this is not surviving the data export from Asteroids or the data imports by FreeDC and BoincStats, it would be a bit much to compensate for that in our import from FreeDC.

Post by crashtech »

I'm pleased to report that I have the stats scripts running as cron jobs on two different computers at two different locations, so there is now high confidence I will be able to provide the stats if needed.

Post by StefanR5R »

I shall take you up on that eventually. On June 2 for example, but I'll send a reminder when it's time. :-)

Post by crashtech »

At worst I would use a guest cruncher list for Ukraine that might possibly be a week out of date, if it's safe to assume that you edit post #2 here when you detect changes in guest crunchers.

Post by crashtech »

Inaccuracies have accumulated in my version of the Team Ukraine weekly stats. These users are missing their usual team affiliation in my version:

Skip Da Shu...............Guru Mountain
Giovani Avelar............none
pkasatka..................Gridcoin
Krasavetz.................TSC! Russia
Graith....................Gridcoin

I'm not super confident about fixing this on my own; the scripts are working and I'd like to keep it that way! Perhaps an updated version of the script could be posted so that my backup versions are congruent with your main ones once more?

Post by StefanR5R »

I edited the 2nd post (only the CODE section) with my current version of weekly_stats_dump_ukraine.sh:
Some more guests, and minor improvements to the sed code for certain special cases of usernames.


Edit: Adding more guests is most of the time just a trivial matter of adding a line of the form "username.......origin team" to the respective list. But I occasionally had to enhance the sed input code for edge cases, most recently for a name with "&" in it.
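
To illustrate the "username.......origin team" list format, here is a small Python sketch which reads such lines into a lookup table. This is not how weekly_stats_dump_ukraine.sh does it (the script uses sed); the function name and the regular expression are made up, and the sample lines are the ones quoted earlier in this thread. A plain dictionary lookup also sidesteps characters such as "&", which has a special meaning in a sed replacement.

```python
import re

# A name, a run of at least three dots, then the origin team.
GUEST_LINE = re.compile(r"^(?P<user>.+?)\.{3,}(?P<team>.+)$")


def parse_guests(text):
    """Map each guest's username to the origin team given after the dots."""
    guests = {}
    for line in text.splitlines():
        match = GUEST_LINE.match(line.strip())
        if match:
            guests[match.group("user")] = match.group("team")
    return guests


sample = """\
Skip Da Shu...............Guru Mountain
Giovani Avelar............none
pkasatka....................Gridcoin
Krasavetz...................TSC! Russia
Graith.........................Gridcoin
"""

guests = parse_guests(sample)
print(guests["Krasavetz"])  # TSC! Russia
```

A username which itself contains three or more dots in a row would confuse this simple pattern.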

Post by crashtech »

Thanks, Stefan. While I imagined it was probably trivial to edit more guests/teams into the script, I am still much better at breaking scripts than fixing them. I'm glad you did it for me.

Post by crashtech »

I'm looking into the need to add support for Cyrillic characters in my Linux Mint installations.

Post by StefanR5R »

I updated the 1st post, dnet_weekly_stats_dump.sh code section:
It now discards all output lines with less than 10 credits/week, but prints the total number of lines which got discarded this way. The previous version of the script did this already with lines with 1 credit/week. (Today for example there were 71 one-credit records, 5 two-credit records, and 1 three-credit record. I think they are all due to some sort of rounding bug in distributed.net's stats keeping, and should therefore be ignored.)
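
The filtering rule described above could look roughly like this in Python. This is only a sketch of the idea, not the code of dnet_weekly_stats_dump.sh; the function name and the sample records are made up.

```python
def filter_low_credit(records, minimum=10):
    """Drop records below `minimum` credits/week and count how many were dropped."""
    kept = [(name, credits) for name, credits in records if credits >= minimum]
    discarded = len(records) - len(kept)
    return kept, discarded


# Hypothetical (member, credits/week) records.
records = [
    ("member_a", 12345),
    ("member_b", 1),
    ("member_c", 2),
    ("member_d", 10),
    ("member_e", 3),
]

kept, discarded = filter_low_credit(records)
print(kept)                    # [('member_a', 12345), ('member_d', 10)]
print("discarded:", discarded)  # discarded: 3
```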

I also updated the 2nd post, weekly_stats_dump_ukraine.sh code section:
I added special handling of the member names Александр, João Mota, and райдужний. These names are already corrupted in the respective project servers' stats export files. (The actual character conversion method mismatches what the XML file headers say.) They only show correctly on the various web pages of the project servers, whereas 3rd party stats sites show them in different but wrong ways. The script now converts what FreeDC is making of them back into how they look at the projects' web pages. Unfortunately, I have not figured out an easy way to do this with any arbitrary name, which is why I coded this ad hoc for these three particular names only. More names can be added later manually in the same way.

Note, I used UTF-8 encoding when I wrote the script, and when I copy-and-pasted it here into the forum post.
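
The ad hoc approach amounts to a small lookup table from the garbled form of a name to its correct form. A Python sketch of that idea follows. Loud caveat: the garbled forms here are generated by one common kind of encoding mix-up (UTF-8 bytes read as Latin-1) purely as a stand-in. What FreeDC actually shows for these names may look different, so a real table would have to contain the strings as they really appear.

```python
def garble(name):
    """Simulate one common kind of mojibake: UTF-8 bytes read as Latin-1."""
    return name.encode("utf-8").decode("latin-1")


# The three names from the post above.
KNOWN_NAMES = ["Александр", "João Mota", "райдужний"]

# Stand-in garbled forms (an assumption, see the caveat above) mapped to the correct names.
NAME_FIXES = {garble(name): name for name in KNOWN_NAMES}


def fix_name(name):
    """Return the corrected name if it is in the table, else the name unchanged."""
    return NAME_FIXES.get(name, name)


fixed = fix_name(garble("João Mota"))
untouched = fix_name("crashtech")
```

Names which are not in the table pass through unchanged, so more entries can be added later without affecting anyone else.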

Post by crashtech »

Well, that explains why I couldn't fix it. I've incorporated your changes, thanks!

Post by StefanR5R »

I updated the 1st post, weekly_stats_dump.sh code section, as well as the 2nd post, weekly_stats_dump_ukraine.sh code section: Added the ODLK2025 project.

Post by crashtech »

Thanks, Stefan, I will update both of my computers that are performing backup stats. As a side note, I am lucky that I have been running redundant instances of the stats updates, because I lost an NVMe drive a couple of weeks ago on one of them. My luck with NVMe drives has been okay, but they don't ever seem to fail gradually. Some brief weird behavior, and then they are gone. It's making me think better about backing up my DC computers, which I don't do at present.

Post by StefanR5R »

I updated the 1st and 2nd post yet again: "Goofyxgrid CPU" and "goofyxGrid@Home NCI" added to weekly_stats_dump.sh and weekly_stats_dump_ukraine.sh

The existing "Goofixgrid" entries in these scripts are apparently obsolete now, but I did not delete them as there is no pressing reason to do so.

Post by crashtech »

I appreciate being entrusted to present the stats these two weeks. I hope you'll let me know if you spot any errors.

Post by StefanR5R »

I edited weekly_stats_dump.sh and weekly_stats_dump_ukraine.sh in the first and second post of this thread:
  • Added the projects BOINC Central and USPEX@Home.
  • Put the FreeDC web accesses into a retry loop. These should work on first try now, but it doesn't hurt to have the loop there.
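
A retry loop of this kind could be sketched in Python as follows. The scripts themselves are bash, so this is not their code; the function name, the number of attempts and the delay are assumptions for illustration. The `fetch` argument stands for whatever performs the web access. Errors raised by urllib, for example, are subclasses of OSError and would be caught here.

```python
import time


def fetch_with_retries(fetch, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call `fetch` until it succeeds, pausing between failed attempts."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError as error:
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)  # go easy on the server before retrying
    raise last_error


# Hypothetical stand-in for a web access which fails twice, then works.
calls = []


def flaky_fetch():
    calls.append(1)
    if len(calls) < 3:
        raise OSError("temporary failure")
    return "page content"


# The demo passes a no-op sleep so that it finishes immediately.
result = fetch_with_retries(flaky_fetch, sleep=lambda seconds: None)
print(result, "after", len(calls), "attempts")  # page content after 3 attempts
```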
Furthermore, here is an alternative to fah_weekly_stats_dump.sh from the first post.
  • The original fah_weekly_stats_dump.sh crawls the EOC web site to get weekly data.
  • The alternative below uses foldingathome.org's original stats export files instead. The downside of the latter approach is that one has to store copies of these export files from week to week, in order to be able to get the difference from one snapshot to the next.
  • My parser of foldingathome.org's dumps is not very sophisticated. It has to deal with newlines and tab characters in team names and in member names, but does so only in a rudimentary fashion.
    • It can handle team names which contain either 1 newline or 1 tab. These are the only special cases which I am aware of, but there may be more.
    • It doesn't handle member names with any number of newlines or tabs in them. I simply ignore records of such members entirely.
  • Also note that foldingathome.org's teams stats dump contains unique team IDs, which keeps things easy. But the members stats dump does not have unique IDs. Member names are not unique:
    • One F@H member may have switched teams during its participation. This results in separate records for the member at team X and at team Y. This is trivial to deal with for me, because I only look at member credits which are associated with TeAm AnandTech.
    • But even within a single team, there are member records with the same names. I don't know if these are indeed different members or if foldingathome has some duplicate records in its database dump for some reason. Instead of making things difficult for myself, I simply take the TeAm member record with the highest credits score and ignore all other records with the same member name.
  • The implementation is split into a bash script which manages the file downloads and storage, and a python script which examines the files and computes the weekly stats.
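
The two ideas in the list above, keeping only the highest-scoring record per member name and taking the difference between two snapshots, can be sketched like this. It is a simplified illustration with made-up sample data and function names, not a replacement for the script below. The assumed field order (name, score, work units, team, separated by tabs) is the one which the script below uses for the members file.

```python
def best_scores(lines, team_id):
    """Highest score per member name among the records of one team."""
    scores = {}
    for line in lines:
        fields = line.split("\t")
        if len(fields) != 4:  # skip records with tabs or linebreaks in names
            continue
        name, score, _wu, team = fields
        if team != team_id:
            continue
        scores[name] = max(int(score), scores.get(name, 0))
    return scores


def weekly_credits(previous, current):
    """Credits gained per member between two snapshots; unchanged members are omitted."""
    return {
        name: current[name] - previous[name]
        for name in current
        if name in previous and current[name] != previous[name]
    }


# Hypothetical snapshots; "alice" appears twice within team 198.
last_week = ["alice\t1000\t10\t198", "alice\t50\t1\t198", "bob\t400\t4\t198"]
this_week = ["alice\t1500\t15\t198", "alice\t50\t1\t198", "bob\t400\t4\t198",
             "carol\t900\t9\t12345"]

gained = weekly_credits(best_scores(last_week, "198"), best_scores(this_week, "198"))
print(gained)  # {'alice': 500}
```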
fah_weekly_stats_dump.sh

Code: Select all

#!/bin/bash

team_id="${1:-198}"

cd "${HOME}/Distributed_Computing/Weekly_Stats/FAH" || exit 1

declare -x LC_ALL="en_US.UTF-8"

                   today="$(date +%d%^b%Y)"
             last_sunday="$(date -d 'last Sunday' +%d%^b%Y)"
      today_members_file="${today}.txt"
last_sunday_members_file="${last_sunday}.txt"
        today_teams_file="${today}_teams.txt"
  last_sunday_teams_file="${last_sunday}_teams.txt"

# delete superfluous data files
for file in [0-3][0-9][JFMASOND][AEPUCO][NBRYLGPTVC]20[2-9][0-9]{,_teams}.txt
do
        case "${file}" in
        "${today}"*) ;;
        "${last_sunday}"*) ;;
        *) rm "${file}" 2>/dev/null;;
        esac
done

# download and cache today's data
[ -f "${today_teams_file}" ] ||
wget -q -O "${today_teams_file}" 'https://apps.foldingathome.org/daily_team_summary.txt'
[ -f "${today_members_file}" ] ||
wget -q -O "${today_members_file}" 'https://apps.foldingathome.org/daily_user_summary.txt'

# compute the weekly data
../fah_weekly_stats_dump.py "${team_id}" \
	"${last_sunday_teams_file}" "${today_teams_file}" \
	"${last_sunday_members_file}" "${today_members_file}"
fah_weekly_stats_dump.py

Code: Select all

#!/usr/bin/env python3

import sys

if len(sys.argv) < 6:
	print("Usage:", sys.argv[0], "teamID prev_teams_file cur_teams_file prev_users_file cur_users_file")
	sys.exit(1)

team_id = sys.argv[1]
last_sunday_teams_file = sys.argv[2]
today_teams_file = sys.argv[3]
last_sunday_members_file = sys.argv[4]
today_members_file = sys.argv[5]

teams = {}
overall_rank = r = 0

f = open(today_teams_file)
lines = f.read().splitlines()
f.close()
lines.pop(0) # export date
lines.pop(0) # table heading
for l in lines:
	t = l.split("\t")
	if len(t) == 2: # special case: teamname contains linebreak
		team, teamname = t
		continue
	if len(t) == 3: # continuation of above special case
		teamname2, score, wu = t
		teamname += teamname2
	elif len(t) == 5: # special case: teamname contains a tab
		team, teamname, teamname2, score, wu = t
		teamname += teamname2
	else:
		team, teamname, score, wu = t
	### print(team, teamname, score, wu, sep="|")
	r += 1
	if team == team_id:
		overall_rank = r
	teams[int(team)] = int(score)

print()
print(" Folding@Home overall position -", overall_rank)

f = open(last_sunday_teams_file)
lines = f.read().splitlines()
f.close()
lines.pop(0) # export date
lines.pop(0) # table heading
for l in lines:
	t = l.split("\t")
	if len(t) == 2: # special case: teamname contains linebreak
		team, teamname = t
		continue
	if len(t) == 3: # continuation of above special case
		teamname2, score, wu = t
		teamname += teamname2
	elif len(t) == 5: # special case: teamname contains a tab
		team, teamname, teamname2, score, wu = t
		teamname += teamname2
	else:
		team, teamname, score, wu = t
	### print(team, teamname, score, wu, sep="|")
	if int(team) in teams:
		teams[int(team)] -= int(score)

s = teams[int(team_id)]
print(" TeAm total for the week -", '{:,}'.format(s))
r = 1
for score in teams.values():
	if score > s:
		r +=1
print(" TeAm rank for weekly production -", r)

del teams

members = {}

f = open(today_members_file)
lines = f.read().splitlines()
f.close()
lines.pop(0) # export date
lines.pop(0) # table heading
for l in lines:
	t = l.split("\t")
	if len(t) != 4: # discard any records with tabs or linebreaks in usernames
		### print(pl)
		### print(t)
		continue
	name, score, wu, team = t
	### print(name, score, wu, team, sep="|")
	if team != team_id: # ignore members of other teams
		continue
	if name in members: # discard subsequent records of non-unique usernames
		continue
	members[name] = int(score)
	### pl = l

prev_members = {}

f = open(last_sunday_members_file)
lines = f.read().splitlines()
f.close()
lines.pop(0) # export date
lines.pop(0) # table heading
for l in lines:
	t = l.split("\t")
	if len(t) != 4: # discard any records with tabs or linebreaks in usernames
		continue
	name, score, wu, team = t
	if team != team_id:
		continue
	if name in prev_members: # discard subsequent records of non-unique usernames
		continue
	prev_members[name] = int(score)

scores_and_names = []

for name in members:
	if name in prev_members:
		score = members[name] - prev_members[name]
		if score:
			scores_and_names.append((score, name))

del members, prev_members
scores_and_names.sort(reverse=True)

print()
print(" __Credit/week _ UserName")
r = 1
for s, n in scores_and_names:
	w = 18 if s > 99999999 else 13
	print(" {}_______".format(r)[0:9],
	      "{:,}____________".format(s)[0:w],
	      n, sep="")
	r +=1
The lines which start with '###' are leftovers from debugging the file parsing.