Message boards :
Seventeen or Bust :
3 months to validate?
This WU was completed on 1 June, one day before the deadline. I'm still waiting for it to be validated. Is it normal for a WU from Seventeen or Bust v5.11 to take this long to be validated?
Name llr_sob_47465285_0
Workunit 116691876
Created 4 May 2010 20:48:11 UTC
Sent 5 May 2010 12:30:09 UTC
Received 1 Jun 2010 23:54:03 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 124979
Report deadline 2 Jun 2010 12:30:09 UTC
Run time 866,718.83
CPU time 719,501.86
Validate state Initial
Credit 0.00
Application version Seventeen or Bust v5.11 | |
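To put the figures in the result details above into more familiar units, here is a minimal Python sketch. It only reuses the timestamps and run times quoted in the post; the helper name `parse_utc` and the variable names are illustrative assumptions, not anything PrimeGrid provides.

```python
from datetime import datetime

# Timestamp layout used on the result page, e.g. "5 May 2010 12:30:09".
# Note: %b expects English month abbreviations (the default C locale).
FMT = "%d %b %Y %H:%M:%S"

def parse_utc(text):
    """Parse a result-page timestamp (hypothetical helper; UTC is implied)."""
    return datetime.strptime(text, FMT)

sent = parse_utc("5 May 2010 12:30:09")
received = parse_utc("1 Jun 2010 23:54:03")
deadline = parse_utc("2 Jun 2010 12:30:09")

turnaround = received - sent       # wall-clock time the task was out
margin = deadline - received       # time left before the report deadline

# Run time and CPU time are reported in seconds; convert to days.
run_days = 866_718.83 / 86_400
cpu_days = 719_501.86 / 86_400

print(f"Turnaround: {turnaround}")
print(f"Margin before deadline: {margin}")
print(f"Run time: {run_days:.2f} days, CPU time: {cpu_days:.2f} days")
```

The task was out for a little over 27 days and came back roughly half a day before the deadline, with about ten days of actual run time.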
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 14037 ID: 53948 Credit: 477,382,143 RAC: 302,741
|
It takes two matching results to validate. Given the long run times of SoB, the long deadlines, and the difficulty some computers have running long LLR tasks, unfortunately it sometimes takes a really long time for another computer to come up with a matching result. Eventually, it will get validated.
I've got one that's almost as old as yours, having started at the beginning of June, and then going through a long string of wingmen who either had errors or just timed out. The first wingman was running SoB on a netbook with an Atom CPU!!! I don't think the wingman running my SoB WU is going to successfully finish it either, so maybe the next one will -- sometime in October, if I'm lucky.
SoB isn't a project with instant gratification! :)
____________
My lucky number is 75898^524288+1 | |
|
|
Hi Roger,
Yes, it can take some time to get your WU granted credit. Be patient; I'm also waiting for mine, finished on the 17th of July. Think you're lucky because you're running Bust v5.11, as most of the systems do. So if the second system is running the same program, you'll be granted credit. If 2 other systems are going to run Bust v6.05, you won't get any. See the "Bust 5.11 versus 6.05" post just below your post.
Don't get annoyed, the app is "just" running and for sure they will straighten things out.
Happy crunching, think positive and don't give up. There are not as many systems running the app as SETI.
Berend | |
|
|
Thanx for the rapid responses. :)
I was starting to worry things might have gotten screwed up.
Good to know everything is ok. | |
|
Vato Volunteer tester
Joined: 2 Feb 08 Posts: 861 ID: 18447 Credit: 873,730,020 RAC: 1,369,334
|
Berend wrote:
Think you're lucky because you're running Bust v5.11, as most of the systems do. So if the second system is running the same program, you'll be granted credit. If 2 other systems are going to run Bust v6.05, you won't get any. See the "Bust 5.11 versus 6.05" post just below your post.
This is NOT accurate.
The different app versions reflect the different wrappers between Windows and Linux, but they are compatible. If you return a completed WU with a correct residue, you WILL get credit.
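The point that results are matched on their residue, not on the app version, can be sketched roughly as follows. This is a simplified illustration and not PrimeGrid's or BOINC's actual validator; the `Result` record, the `validate` and `credited` functions, and the example residue strings are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Result:
    host_id: int
    app_version: str  # e.g. "5.11" (Windows) or "6.05" (Linux); never compared
    residue: str      # hypothetical string form of the final test residue

def validate(results, quorum=2):
    """Return (state, canonical_residue) for one workunit.

    Assumption: a workunit validates once `quorum` results share a residue.
    """
    if len(results) < quorum:
        return "pending", None          # still waiting for a wingman
    residue, count = Counter(r.residue for r in results).most_common(1)[0]
    if count >= quorum:
        return "valid", residue
    return "inconclusive", None         # results disagree; another host is needed

def credited(results, quorum=2):
    """Host IDs whose residue matches the canonical one."""
    state, canonical = validate(results, quorum)
    if state != "valid":
        return []
    return [r.host_id for r in results if r.residue == canonical]

linux_host = Result(1, "6.05", "A1B2C3")
flaky_host = Result(2, "5.11", "FFFF00")   # wrong residue, e.g. unstable hardware
windows_host = Result(3, "5.11", "A1B2C3")

print(validate([linux_host]))                            # one result only
print(validate([linux_host, flaky_host]))                # two results that disagree
print(credited([linux_host, flaky_host, windows_host]))  # hosts 1 and 3 match
```

In this sketch the third host runs the same version as the host that failed, yet it validates against the Linux result, because only the residues are compared.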
____________
| |
|
|
Well Vato,
We'll find out. This WU is waiting for credit. Being Dutch, the outcome is in Dutch and you can see that the outcome (Uitkomst) is "Geslaagd": success with no errors. You can also see that I'm running Linux (Bust v6.05). The system crunching it at this moment is a Windows box (v5.11). I am almost sure that when this WU has been crunched by that box, my status will become "Waiting for check, outcome inconclusive" or something like that in English. If a third box crunches the WU and it is a Windows box, I will not get any points. Conversely, if the third box is running Linux, the Windows box will not get any points.
I crunched 3 SoB WUs and that was the outcome on the first 2: one got credit (two v6.05 apps versus one v5.11), one didn't (two v5.11 apps versus my v6.05 app).
I'll come back on this when the WU below is or is not granted credit.
Greetings from the Low Lands,
Berend
Name llr_sob_47466438_0
Workunit 121831790
Created 22 Jun 2010 4:34:15 UTC
Sent 23 Jun 2010 15:56:27 UTC
Received 17 Jul 2010 17:48:55 UTC
Server state Over (Binnen)
Outcome (Uitkomst) Success (Geslaagd)
Client state Done
Exit status 0 (0x0)
Computer ID 141904
Report deadline 28 Jul 2010 15:56:27 UTC
Run time 1,312,135.67
CPU time 1,197,453.48
Validate state Initial
Credit 0.00
Application version Seventeen or Bust v6.05
Generated 12 Sep 2010 9:36:41 UTC | |
|
Rytis Volunteer moderator Project administrator
Joined: 22 Jun 05 Posts: 2653 ID: 1 Credit: 109,647,663 RAC: 47,581
|
Let me just give you a few examples where 6.05 validated against 5.11 just fine:
http://www.primegrid.com/workunit.php?wuid=104266214
http://www.primegrid.com/workunit.php?wuid=109185700
http://www.primegrid.com/workunit.php?wuid=109051021 (in fact, this result also shows that 6.05 vs 6.05 can be deemed inconclusive, whereas 5.11 would validate against 6.05).
So I claim that there is no validation issue between 6.05 vs 5.11.
____________
| |
|
|
OK,
So if I look at the last WU, which was validated between a v5.11 and a 6.05 app, the system that ran for 2.2 million seconds did this calculation just for fun. I don't think the operator will like it, because he had the same outcome as I had: success, etc., etc. So there must be some kind of discrepancy between the 2 apps. I think it will be hard on people crunching SoB for the points if they are left with none in the end. I don't really care about the points, but I'm not running SoB when there is a chance that 2 million seconds of calculating just ends with a successful run being granted zero points. Then I'd rather run WCG or another app like LLR.
Still no hard feelings, but I think someone has to look at what's going on. As I said, I'll come back to the subject once the last SoB WU has been completed, whether or not points are granted.
Keep on crunching,
Berend
| |
|
Michael Goetz Volunteer moderator Project administrator
|
Berend wrote:
OK,
So if I look at the last WU, which was validated between a v5.11 and a 6.05 app, the system that ran for 2.2 million seconds did this calculation just for fun. I don't think the operator will like it, because he had the same outcome as I had: success, etc., etc. So there must be some kind of discrepancy between the 2 apps.
ANY long WU (here or anywhere else) can be frustrating if it doesn't complete for whatever reason. It's just as frustrating, or even more frustrating, to have a CPDN WU fail at 95% after two months of crunching. There's no validation to worry about, and you get intermediate credit for trickles, but it's still very, very frustrating. I've had that happen more times than I'd like.
But your conclusion is only one of several possible conclusions that could be drawn from the data, and the available data seems to point to a different one.
You have, with SoB, many instances of WUs failing to complete with "Computation Errors", many instances of users aborting the WUs, and fewer, but still a lot, of instances where completed WUs don't validate against each other.
As you point out, there are two different versions of the application.
There are two obvious possibilities:
1) There's a discrepancy between the two versions of the apps causing them to not validate against each other.
2) There's a problem (or problems) not related to the application version that is causing results not to validate.
As can be easily shown by looking at a variety of WUs, there seems to be no correlation between the app versions and whether a result validates correctly:
* 6.05 sometimes validates against 6.05, and sometimes does not
* 5.11 sometimes validates against 5.11, and sometimes does not
* 5.11 sometimes validates against 6.05, and sometimes does not
If there was a discrepancy in the two versions, you would see 5.11 always not validating against 6.05, or at least a higher frequency of invalid results. The data, however, simply does not show that to be happening. There's no difference, as far as I can see, in the success rate of 5.11, 6.05, or 5.11 vs. 6.05.
Whatever the problem is, the data seems to preclude it being a difference in the software versions.
So what is the problem? It *seems* to be a problem with particular computers having trouble running LLR, especially for extended periods of time. It's analogous to running your car for two weeks at 120 MPH -- you might run into problems you wouldn't see in normal usage. In the case of LLR, it tends to bring out the worst in CPUs, for whatever reason. Some computers, which normally are stable, just don't run it well. Is it because the hardware is at fault under those conditions? Is it a flaw in the software? I can't say for sure, although I have my suspicions.
But whatever the problem is, it's almost certainly not a problem with some difference between 5.11 and 6.05.
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
|
I need to point out that I've run a bunch of PSP and SoB WUs, and have run into the exact same problem.
I completed my WU, and then waited -- and waited and waited -- for someone else to complete it too. Someone finally did, and, as with you, the results did not validate. Same as you, the versions were different. One was Windows, one was Linux. (I don't remember which I was running since I run both.)
The next computer was running the same version as the other guy -- so I was the one with the not-matching version. Like you, I was a bit worried that there might be a discrepancy in the versions that might cause my perfectly good work to go to waste.
But then I looked at the computer that didn't validate against mine. You know what? It had several WUs that failed validation. So I was fairly confident that the third computer's result would validate against mine.
Sure enough, when it finally finished, that third computer validated against mine, even though it was running the same version as the second (non-valid) computer.
That's why I'm so sure this has nothing to do with the versions.
____________
My lucky number is 75898^524288+1 | |
|
Michael Goetz Volunteer moderator Project administrator
|
More data is always better than less data so I compiled a list of completed results from all my LLR WUs still in the database. 5.11 & 5.13 are Windows, 6.03 and 6.05 are Linux, and 1.05 is Mac. Only WU's with a valid canonical result are shown. The only SoB task I have left only has one completed result, but PSP is identical except for the values being tested.
Not shown are any results that did not complete processing, including results that timed out, were aborted, or had computation errors during processing.
Results are listed in the order they were returned.
PSP:
127175186 Valid: 5.11, 6.03
127167482 Valid: 5.11, 5.11
127163647 Valid: 5.11, INVALID: 6.03, Valid: 6.03, 6.03
126992980 Valid: 5.11, 5.11, 5.11
126958608 Valid: 5.11, 5.11
124825384 Valid: 5.11, 6.03
124660653 Valid: 5.11, 6.03, 5.11
The Riesel problem is identical to SoB/PSP except that the form of the candidate is k*2^n-1 rather than k*2^n+1.
TRP:
130682090 Valid: 5.11, 5.11
130595394 Valid: 5.11, 5.11
130199819 Valid: 5.11, 5.11
130610698 Valid: 5.11, 5.11
130471520 Valid: 5.11, 5.11
130071455 Valid: 6.05, 5.11
130071463 Valid: 5.11
130096104 Valid: 5.11, 6.05, 1.05
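The two candidate forms mentioned above, k*2^n+1 for SoB/PSP and k*2^n-1 for the Riesel problem, can be illustrated with a small Python sketch. For the +1 form it applies Proth's theorem on tiny numbers. This is only an illustration under stated assumptions: it is not how the LLR application is implemented, and the function names and the choice of trial bases are hypothetical.

```python
def proth_candidate(k, n):
    """Candidate of the SoB/PSP form: k*2^n + 1."""
    return k * 2**n + 1

def riesel_candidate(k, n):
    """Candidate of the Riesel-problem form: k*2^n - 1."""
    return k * 2**n - 1

def proth_proves_prime(k, n, bases=(2, 3, 5, 7, 11, 13)):
    """Try to prove N = k*2^n + 1 prime with Proth's theorem.

    For odd k < 2^n, N is prime if some base a satisfies
    a^((N-1)/2) == -1 (mod N). Returns True when one of the trial
    bases is such a witness. False means "no witness among these
    bases", which is always the case for a composite N.
    """
    if k % 2 == 0 or k >= 2**n:
        raise ValueError("Proth's theorem needs odd k with k < 2**n")
    candidate = proth_candidate(k, n)
    exponent = (candidate - 1) // 2
    return any(pow(a, exponent, candidate) == candidate - 1 for a in bases)

print(proth_candidate(3, 2), riesel_candidate(3, 2))  # same k and n, two forms
print(proth_proves_prime(3, 2))   # N = 13
print(proth_proves_prime(5, 3))   # N = 41
print(proth_proves_prime(3, 3))   # N = 25, composite
```

Real SoB candidates have millions of digits, which is why a single test runs for days or weeks rather than microseconds.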
Normally, there are only two valid results per WU, but you can get more than two if a result times out, a new result is assigned to another computer, and then the timed-out result gets returned.
Usually, it takes at least two valid results for an LLR WU to be completed, but there are at least two scenarios I'm aware of that can validate with only a single result. The first (which is what happened with 130071463) is when PrimeGrid is doublechecking a primality test done elsewhere. The second scenario is where PrimeGrid is selectively requiring only a single valid result when sent to normally reliable hosts.
This is the raw data; feel free to come to any conclusions you wish about what it means. :)
(Note: There's a preponderance of 5.11's because recently I've been running only Windows, so by default at least 50% of the results shown will be 5.11.)
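For anyone who wants to tally the raw data above, here is a minimal Python sketch. The workunit IDs, versions and valid/invalid flags are copied from the PSP and TRP lists, and the version-to-platform mapping is the one given at the top of the post. The `summarize` function and its category names are illustrative assumptions.

```python
# App version -> platform, as stated in the post.
PLATFORM = {"5.11": "Windows", "5.13": "Windows",
            "6.03": "Linux", "6.05": "Linux", "1.05": "Mac"}

# Workunit ID -> [(app version, result was valid?), ...] in return order.
RESULTS = {
    # PSP
    127175186: [("5.11", True), ("6.03", True)],
    127167482: [("5.11", True), ("5.11", True)],
    127163647: [("5.11", True), ("6.03", False), ("6.03", True), ("6.03", True)],
    126992980: [("5.11", True), ("5.11", True), ("5.11", True)],
    126958608: [("5.11", True), ("5.11", True)],
    124825384: [("5.11", True), ("6.03", True)],
    124660653: [("5.11", True), ("6.03", True), ("5.11", True)],
    # TRP
    130682090: [("5.11", True), ("5.11", True)],
    130595394: [("5.11", True), ("5.11", True)],
    130199819: [("5.11", True), ("5.11", True)],
    130610698: [("5.11", True), ("5.11", True)],
    130471520: [("5.11", True), ("5.11", True)],
    130071455: [("6.05", True), ("5.11", True)],
    130071463: [("5.11", True)],
    130096104: [("5.11", True), ("6.05", True), ("1.05", True)],
}

def summarize(results):
    """Count workunits by how their valid results were distributed."""
    summary = {"cross_platform": 0, "same_platform": 0,
               "single_result": 0, "invalid_results": 0}
    for entries in results.values():
        valid_platforms = [PLATFORM[version] for version, ok in entries if ok]
        summary["invalid_results"] += sum(1 for _, ok in entries if not ok)
        if len(valid_platforms) == 1:
            summary["single_result"] += 1
        elif len(set(valid_platforms)) > 1:
            summary["cross_platform"] += 1   # e.g. Windows validated against Linux
        else:
            summary["same_platform"] += 1
    return summary

print(summarize(RESULTS))
```

The sample is small and, as noted, skewed toward 5.11, so the counts are only a convenience for reading the list, not a statistical result.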
____________
My lucky number is 75898^524288+1 | |
|
|
OK Michael,
I read the whole lot but I still find it odd. I had no problems running LLR. All of them validated. I'm just curious why these things happen. The only project in which this happened to me was WCG and it only happened once. But the runtime in that project isn't that long. In all other projects this never occurred.
As I said I will only come back on the matter when the last WU is crunched.
Again, it's just curiosity. The last WU will probably be validated somewhere around October or November.
Keep on crunching and wishing you all the best
Berend | |
|