PrimeGrid
1) Message boards : Generalized Fermat Prime Search : Genefer-19 tasks invalid (Message 151559)
Posted 2 days ago by Michael Goetz (Project donor)
Oops! Sorry. Let's try again.

<core_client_version>7.16.19</core_client_version>
<![CDATA[
<stderr_txt>
geneferocl 3.3.3-2 (Apple-x86/OpenCL/64-bit)

Copyright 2001-2018, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Michael Goetz, Ronald Schneider
Copyright 2011-2018, Iain Bethune
Genefer is free source code, under the MIT license.

Command line: geneferocl_macintel_3.3.3-2 -boinc -q 132841232^65536+1 --device 0

Normal priority change failed (needs superuser privileges.
Checking available transform implementations...
OCL transform is past its b limit.
OCL3 transform is past its b limit.
OCL4 transform is past its b limit.
OCL5 transform is past its b limit.
Using OCL2 transform
Running on platform 'Apple', device 'AMD Radeon Pro 5300 Compute Engine', vendor 'AMD', version 'OpenCL 1.2 ' and driver '1.2 (Aug 30 2021 06:56:17)'.
20 computeUnits @ 1650MHz, memSize=4080MB, cacheSize=0kB, cacheLineSize=0B, localMemSize=64kB, maxWorkGroupSize=256.
Starting initialization...
Initialization complete (0.067 seconds).
Testing 132841232^65536+1...
Estimated time for 132841232^65536+1 is 0:05:17
132841232^65536+1 is complete. (532371 digits) (err = 0.0000) (time = 0:05:14) 21:34:27
21:34:27 (27524): called boinc_finish

</stderr_txt>
]]>


The software didn't pick up any errors, which means a calculation error occurred and eventually produced the wrong result. It's a hardware error, most likely the GPU.
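
One figure in the log above can be checked independently of the GPU: the reported size of 532371 digits. The following is a minimal sketch, not part of Genefer; the function name `decimal_digits` is made up for illustration, and it assumes floating-point logarithms are accurate enough, which holds here because the fractional part is far from an integer boundary.

```python
import math

def decimal_digits(b: int, n: int) -> int:
    """Approximate number of decimal digits of b**n + 1 (hypothetical helper).

    Uses floor(n * log10(b)) + 1, the digit count of b**n. Adding 1 only
    changes the count if b**n is all nines, which is ignored in this sketch.
    """
    return math.floor(n * math.log10(b)) + 1

# The candidate from the log: 132841232^65536+1
print(decimal_digits(132841232, 65536))  # 532371
```

Matching the digit count only confirms that the task tested the intended number. It says nothing about whether the result itself was computed correctly.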
2) Message boards : Number crunching : o.k., I will bite. (Message 151558)
Posted 2 days ago by Michael Goetz (Project donor)
Why does a GFN-22 unit have a longer average GPU time than a World Record unit,
and yet less credit is given? Also, I have a meager RTX3060, being held back by a
PCI 3.0 /8 bit bus and my times are almost 50% less on a GFN-22 and 40% less on a
World Record unit? Just curious. Is it the "b>=" value? The 3090 must do these
units quite fast on PCI 4.0 x 16 bit bus.


Because people are running the (slightly larger) DYFL tasks on faster GPUs. On the same hardware, DYFL will always take longer to run than GFN-22.
3) Message boards : Generalized Fermat Prime Search : Genefer-19 tasks invalid (Message 151543)
Posted 3 days ago by Michael Goetz (Project donor)
geneferocl 3.3.3-2 (Windows/OpenCL/32-bit)
...
Running on platform 'NVIDIA CUDA', device 'NVIDIA GeForce GTX 1070 Ti',
...
This particular iMac does not have Nvidia H/W, at least I don't think it does.

What is it telling me?


It's telling you that you're either looking at the wrong task or the wrong computer, because that task ran on a Windows computer with an Nvidia 1070 Ti GPU. :)
4) Message boards : Generalized Fermat Prime Search : Genefer-19 tasks invalid (Message 151521)
Posted 6 days ago by Michael Goetz (Project donor)
If you look in the stderr log on the respective task pages, you'll see that both invalid tasks had a "maxerr exceeded" error, followed by trying to recover by reverting to the last checkpoint. Furthermore, both tasks were running at the same point in time.

I suspect that your computer experienced some sort of fault or glitch which altered either the CPU state, cache, or main memory, affecting and corrupting both calculations. If this continues to happen, then you need to start diagnosing the computer hardware. If it doesn't happen again, I wouldn't worry about it too much, but I'd keep an eye out for future occurrences.
5) Message boards : General discussion : Tasks returned in the last 24 hours (Message 151520)
Posted 6 days ago by Michael Goetz (Project donor)
This does say "Tasks returned in the last 24 hours", maybe add a sub-part for the proof tasks? Personally when seeing this, I am interested in all my tasks that were returned. I know some days I am returning quite a few proof tasks and it would be nice to know the count.


The purpose of those statistics is to help people with their "firsts". If anything, it probably should be removed altogether for LLR2 projects, since it's not relevant.
6) Message boards : Wieferich and Wall-Sun-Sun Prime Search : Any idea on WU times CPU vs. GPU? (Message 151465)
Posted 9 days ago by Michael Goetz (Project donor)
Job cache is too small for this project. I am using an RTX 3090 FE GPU and each work unit takes about 4 minutes and 45 seconds to complete. Ever thought about greatly increasing the job cache for us high end GPU owners?


What exactly do you mean by "job cache"? Most of the queue and buffer controls that affect job flow are actually set by the user rather than by the admins.

If your computer isn't requesting work, that's a setting on your side. You may need to look at the messages in your BOINC log to see whether or not your computer is requesting work when it contacts the server.

If it is requesting work and you're not getting any, unless you're trying to grab an ungodly number of tasks (which is a terrible strategy and you shouldn't be doing that), then the problem is likely that you have run out of wingmen and all the available tasks are from workunits where you've already run a task, and are thus ineligible to get another task. If this is the problem, there's little that can be done about it other than convincing other people to run WW or to diversify your task selection.
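
The "out of wingmen" situation can be illustrated with a small sketch. This is not PrimeGrid's or BOINC's scheduler code; the `Workunit` class, its fields, and `eligible_workunits` are hypothetical names. It assumes only the rule stated above: a user cannot receive a second task from a workunit where they have already run one.

```python
from dataclasses import dataclass, field

@dataclass
class Workunit:
    # Hypothetical model of a workunit, for illustration only.
    name: str
    unsent_tasks: int = 0
    users_with_tasks: set = field(default_factory=set)

def eligible_workunits(workunits, user_id):
    """Workunits that still have an unsent task and that this user has not touched."""
    return [
        wu for wu in workunits
        if wu.unsent_tasks > 0 and user_id not in wu.users_with_tasks
    ]

queue = [
    Workunit("wu_a", unsent_tasks=1, users_with_tasks={"alice"}),
    Workunit("wu_b", unsent_tasks=1, users_with_tasks={"alice"}),
    Workunit("wu_c", unsent_tasks=0, users_with_tasks={"bob"}),
]

# Work is available, but none of it can go to alice.
print([wu.name for wu in eligible_workunits(queue, "alice")])  # []
print([wu.name for wu in eligible_workunits(queue, "bob")])    # ['wu_a', 'wu_b']
```

In this toy queue the server has unsent tasks, yet a request from "alice" returns nothing until another user picks up the other half of each workunit.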
7) Message boards : Problems and Help : Not first in PPS-MEGA? (Message 151417)
Posted 13 days ago by Michael Goetz (Project donor)
I see. But why does the task get abandoned and later recovered?


They are marked as abandoned because the server thinks the host computer is no longer attached to PrimeGrid. Usually this happens because the host changed ID numbers or more than one host is using the same name or ID number. I can't tell you exactly what is happening on your computers because the problem is on your computers where we don't have visibility.

The "recovered" part is easy to understand, however. Jim wrote a program that periodically scans the server for tasks that are in an error state but actually do have valid result files. When it finds one, it changes the state to "pending validation" to give you an opportunity to get credit. Normally, BOINC would just discard these tasks.
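
The behavior described above can be sketched roughly as follows. This is not Jim's actual program; the `Task` class, its field names, and the state strings are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Task:
    # Hypothetical task record; real server tables will look different.
    task_id: int
    state: str
    has_valid_result_file: bool

def recover_tasks(tasks):
    """Move errored tasks that still have a valid result file to 'pending validation'.

    Returns the IDs of the tasks that were recovered.
    """
    recovered = []
    for task in tasks:
        if task.state == "error" and task.has_valid_result_file:
            task.state = "pending validation"
            recovered.append(task.task_id)
    return recovered

tasks = [
    Task(1, "error", True),     # abandoned, but the result file arrived: recover
    Task(2, "error", False),    # genuine error, nothing to validate
    Task(3, "completed", True), # already fine, left alone
]
print(recover_tasks(tasks))  # [1]
```

Because such a scan runs periodically, a recovered task can enter validation well after it was actually reported.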
8) Message boards : Problems and Help : Not first in PPS-MEGA? (Message 151397)
Posted 15 days ago by Michael Goetz (Project donor)
Also, for the task above, it seems like I reported earlier (5:36 UTC) than my opponent (5:49 UTC). Why was the opponent given the 1st?


The recovery of your task most likely happened after the other task was returned and marked as the canonical (aka "1st") task. In other words, the other task was completed and marked as first before the server's cleanup process found and "unabandoned" your task.
9) Message boards : Number crunching : All tasks are aborted (Message 151301)
Posted 23 days ago by Michael Goetz (Project donor)
Hello,

Just out of curiosity, I checked a recent CW task I crunched and found out that the host below keeps aborting all WUs it gets.

https://www.primegrid.com/show_host_detail.php?hostid=1084645


Why is that?


LLR2 will not run on Windows XP.


This is true, however, it's not the reason that host is aborting those tasks.

If a Windows XP host gets an LLR2 task, it will ERROR; it will *not* ABORT.

This host is aborting tasks. They're not erroring out.

Also, it's aborting SGS tasks too, and these are the old LLR, not LLR2, and should run just fine on that host.

I do not know if this is the same host that we discussed recently on Discord, but if it's a different host it's exhibiting the exact same behavior. While it's impossible to know for certain what the actual cause is, the most likely reason that fits the known facts is that the computer's clock is set incorrectly with a time in the future.

As soon as BOINC downloads a task, because the clock is set wrong, it thinks the task is already past its deadline. BOINC automatically aborts *unstarted* tasks that are past their deadline. Every task is therefore immediately aborted and returned to the server, which dutifully sends more tasks to the host. This continues until a daily limit is reached.
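
A minimal sketch of that failure mode, assuming only the rule stated above (unstarted tasks past their deadline are aborted). This is not BOINC's actual code; the function name, dates, and the 5-day deadline are invented for illustration.

```python
from datetime import datetime, timedelta

def aborts_on_arrival(host_clock: datetime, deadline: datetime, started: bool) -> bool:
    """Illustrative rule: an unstarted task is aborted once the host clock is past its deadline."""
    return not started and host_clock > deadline

real_now = datetime(2021, 8, 28, 12, 0)         # hypothetical true time at download
deadline = real_now + timedelta(days=5)         # hypothetical deadline set by the server
skewed_clock = real_now + timedelta(days=365)   # host clock wrongly set a year ahead

print(aborts_on_arrival(real_now, deadline, started=False))      # False
print(aborts_on_arrival(skewed_clock, deadline, started=False))  # True
```

With the skewed clock, every freshly downloaded task already looks overdue, so each one is aborted before it starts.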

It doesn't hurt any other users, and it doesn't hurt the server, so you can safely ignore this host.
10) Message boards : Number crunching : Once in a Blue Moon Challenge (Message 151201)
Posted 32 days ago by Michael Goetz (Project donor)
Why is the abortion-rate that high?
I mean, I too aborted around 10%, to "realign" the WUs with some DC-tasks, but that high a percentage?
Any ideas?

There are many scenarios in which you download more tasks than you actually want to run and abort some of them – maybe by some logic. Some of them were outlined by others above, but I'd like to point out one more.

There's a bug in BOINC where – if you don't have any tasks for some subproject in your client_state.xml or whatever and you start running that subproject, it will download one task per each core (as if you were going to run one task per core), independent of the settings of how many cores per task you've set for the subproject. So it's very possible to get way too many tasks downloaded at the start of the challenge like this and abort most of them.

Maybe this was fixed or alleviated by the changes with how you can set multi-threading from the server side preferences now, I wouldn't know for sure (haven't crunched in... years now).


Very likely the high abort rate is from DC task hunting. With the long running llrPSP tasks,
you may have a large amount of time left after running your last long task that will finish
before the challenge ends. So you can run DC tasks to use the remaining challenge time.
You do this by downloading 10 days of tasks, aborting all the long tasks, and keeping the
DC tasks to do after your last long task finishes.


While that's indeed a plausible and reasonable explanation, the actual explanation is that it's mostly due to a single broken computer -- most likely with its system date set in the future -- that is continuously and automatically aborting all tasks as soon as they're received. It's certainly not intentional.


If you go to the Challenge leaderboard and look for users that have 5 times or more
tasks than their neighbors, they're probably DC task hunting, and aborting a lot of tasks.


It's certainly possible that people are doing this, but they're not the cause of the large number of aborted tasks. If this is happening, it's only causing a smaller portion of the aborts. By a significant margin, the largest source of the aborts is a single malfunctioning computer.

What you're suggesting isn't at all unreasonable. I suspected the same thing, which is part of the reason I looked into what was happening. But this wasn't, in fact, the cause of the aborts.


Copyright © 2005 - 2021 Rytis Slatkevičius (contact) and PrimeGrid community. Server load 2.79, 3.36, 3.51
Generated 20 Sep 2021 | 18:12:14 UTC