Message boards : Problems and Help : PPS sieve errors
Machine 944175 has twice failed to do a PPS Sieve. The only message I can see is:
58: 02-May-2020 09:16:46 (low) [PrimeGrid] Starting task pps_sr2sieve_131387705_1
59: 02-May-2020 09:16:48 (low) [PrimeGrid] Computation for task pps_sr2sieve_131387705_1 finished
60: 02-May-2020 09:16:48 (low) [PrimeGrid] Output file pps_sr2sieve_131387705_1_r164754821_0 for task pps_sr2sieve_131387705_1 absent
Any advice on what I could do? | |
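When the event log is long, the affected tasks can be pulled out mechanically. This is a minimal sketch that assumes the log text has been saved or piped in with the same wording as the lines quoted above; the function name `absent_tasks` is made up for illustration.

```shell
# Print the name of every task whose output file the client reported as absent.
# Reads event-log text on stdin; the matched wording is copied from the
# log lines quoted in this post.
absent_tasks() {
    sed -n 's/.*Output file [^ ]* for task \([^ ]*\) absent.*/\1/p' | sort -u
}
```

Each task name is printed once, even if the log mentions it several times.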
|
stream (Volunteer moderator, Project administrator, Volunteer developer, Volunteer tester)
Joined: 1 Mar 14 Posts: 858 ID: 301928 Credit: 503,832,307 RAC: 258,298
|
Unfortunately, the program crashed immediately without any output or clues:
process got signal 11
SIGSEGV: segmentation violation
| |
|
|
Is there any way I could see this for myself? | |
|
|
Would running in standalone help?
____________
SHSIDElectronicsGroup@outlook.com
waiting for a TdP prime...
Proth "SoB": 44243*2^440969+1
| |
|
|
Would running in standalone help?
How?
| |
|
stream (Volunteer moderator, Project administrator, Volunteer developer, Volunteer tester)
|
Is there any way I could see this for myself?
On the site, go to "Your account" -> "Tasks" and click the number of the task in question. The "stderr output" field contains all the information generated by the task.
To run the task manually, go to the BOINC data directory (usually /var/lib/boinc, but this may depend on the distro). There is a subdirectory with the same name as the project (i.e. www.primegrid.com) where all the programs are stored. Run the affected program manually and see whether it prints anything (like usage help) or crashes immediately. At least we will then know whether the program always crashes (probably due to something unexpected or incompatible on this system) or whether the crash happens later, when the real calculations start.
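A minimal sketch of finding candidates to run, assuming a shell on the affected machine and a `find` that supports `-maxdepth` (GNU and BSD versions do). The function name `list_project_programs` is made up for illustration. It only checks the owner-execute bit, so wrappers and any data file that happens to be marked executable will be listed too.

```shell
# List regular files directly inside the given directory that have the
# owner-execute bit set, one per line, sorted by name.
# The directory is an argument because the BOINC data path varies by distro.
list_project_programs() {
    find "$1" -maxdepth 1 -type f -perm -u+x | sort
}
```

Call it with the project directory located as described above, then try the listed sieve programs by hand.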
| |
|
|
I cannot see anything that looks like a program I should run.
total 129712
-rwxr-xr-x 1 boinc boinc 303 Dec 25 2018 llr.ini.6.07
-rwxr-xr-x 1 boinc boinc 4182656 Dec 25 2018 primegrid_llr_wrapper_8.01_x86_64-pc-linux-gnu
-rwxr-xr-x 1 boinc boinc 37898152 Dec 25 2018 sllr64.3.8.21
-rwxr-xr-x 1 boinc boinc 2130168 Dec 30 2018 primegrid_gcw_sieve_1.00_x86_64-pc-linux-gnu
-rw-r--r-- 1 boinc boinc 1274417 Dec 30 2018 gc13_20171214.sieveinput
-rw-r--r-- 1 boinc boinc 1167163 Dec 30 2018 gc25_20171214.sieveinput
-rw-r--r-- 1 boinc boinc 844607 Dec 30 2018 gc49_20171214.sieveinput
-rw-r--r-- 1 boinc boinc 1098447 Dec 30 2018 gc69_20171214.sieveinput
-rw-r--r-- 1 boinc boinc 726231 Dec 30 2018 gc47_20171214.sieveinput
-rw-r--r-- 1 boinc boinc 974663 Jan 2 2019 gc55_20171214.sieveinput
-rw-r--r-- 1 boinc boinc 669791 Jan 10 2019 gc47_20190109.sieveinput
-rw-r--r-- 1 boinc boinc 1178397 Jan 10 2019 gc13_20190109.sieveinput
-rw-r--r-- 1 boinc boinc 1010935 Jan 10 2019 gc69_20190109.sieveinput
-rw-r--r-- 1 boinc boinc 896927 Jan 10 2019 gc55_20190109.sieveinput
-rw-r--r-- 1 boinc boinc 1040820 Jan 10 2019 gc25_20190109.sieveinput
-rw-r--r-- 1 boinc boinc 752159 Jan 10 2019 gc49_20190109.sieveinput
-rwxr-xr-x 1 boinc boinc 4112896 Jan 16 2019 primegrid_genefer_3_3_4_3.20_x86_64-pc-linux-gnu__cpuGFN17MEGA
-rwxr-xr-x 1 boinc boinc 4112896 Apr 22 2019 primegrid_genefer_3_3_4_3.20_x86_64-pc-linux-gnu__cpuGFN19
-rw-r--r-- 1 boinc boinc 6911635 May 4 2019 321_20190301.sieveinput
-rwxr-xr-x 1 boinc boinc 4182656 May 9 2019 llr_wrapper_8.00_x86_64-pc-linux-gnu
-rwxr-xr-x 1 boinc boinc 4112896 May 9 2019 genefer_linux64_3.3.4
-rwxr-xr-x 1 boinc boinc 972208 May 12 2019 tpsieve_0.3.10d_linux64
-rwxr-xr-x 1 boinc boinc 4999088 May 23 2019 ap27_2.6_cpu_linux64
-rwxr-xr-x 1 boinc boinc 38281976 May 30 2019 sllr64.3.8.23
-rwxr-xr-x 1 boinc boinc 71016 Aug 6 2019 sr2sieve64_1.8.2_linux
-rwxr-xr-x 1 boinc boinc 448160 Aug 6 2019 sr2sieve-wrapper64_2.00_linux
-rwxr-xr-x 1 boinc boinc 494176 Sep 4 2019 llr_wrapper_8.04_x86_64-pc-linux-gnu
-rw-r--r-- 1 boinc boinc 2460 Nov 6 20:32 stat_primegrid.png
-rw-r--r-- 1 boinc boinc 90521 Nov 6 20:32 primegrid_slideshow_00.png
-rw-r--r-- 1 boinc boinc 8077886 Apr 29 09:33 321_20200408.sieveinput
-rw-r--r-- 1 boinc boinc 55 May 2 16:23 321_sr2sieve_9351912_cmd
-rw-r--r-- 1 boinc boinc 69 May 2 16:26 stat_icon
-rw-r--r-- 1 boinc boinc 77 May 2 16:26 slideshow_psp_sr2sieve_00
-rw-r--r-- 1 boinc boinc 77 May 2 16:26 slideshow_primegen_00
-rw-r--r-- 1 boinc boinc 77 May 2 16:26 slideshow_llrWOO_00
-rw-r--r-- 1 boinc boinc 77 May 2 16:26 slideshow_llrTPS_00
-rw-r--r-- 1 boinc boinc 77 May 2 16:26 slideshow_llrCUL_00
-rw-r--r-- 1 boinc boinc 77 May 2 16:26 slideshow_gcwsieve_00
~ | |
|
stream (Volunteer moderator, Project administrator, Volunteer developer, Volunteer tester)
|
I cannot see anything that looks like a program I should run.
This one:
-rwxr-xr-x 1 boinc boinc 972208 May 12 2019 tpsieve_0.3.10d_linux64
run as:
./tpsieve_0.3.10d_linux64 -h
It should print help.
| |
|
|
I cannot see anything that looks like a program I should run.
This one:
-rwxr-xr-x 1 boinc boinc 972208 May 12 2019 tpsieve_0.3.10d_linux64
run as:
./tpsieve_0.3.10d_linux64 -h
It should print help.
I managed to get an strace output:
execve("./tpsieve_0.3.10d_linux64", ["./tpsieve_0.3.10d_linux64", "-h"], 0x7ffe48d07638 /* 13 vars */) = 0
uname({sysname="Linux", nodename="glenthorne", ...}) = 0
brk(NULL) = 0xf77000
brk(0xf77f30) = 0xf77f30
arch_prctl(ARCH_SET_FS, 0xf77870) = 0
set_tid_address(0xf77900) = 4285
set_robust_list(0xf77910, 24) = 0
futex(0x7ffd81445f2c, FUTEX_WAKE_PRIVATE, 1) = 0
rt_sigaction(SIGRTMIN, {sa_handler=0x430070, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x42edb0}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {sa_handler=0x42ffb0, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x42edb0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
open("/dev/urandom", O_RDONLY) = 3
read(3, "\356@\326\244\316v\36", 7) = 7
close(3) = 0
brk(0xf98f30) = 0xf98f30
brk(0xf99000) = 0xf99000
close(2) = 0
open("stderr.txt", O_WRONLY|O_CREAT|O_APPEND, 0666) = 2
fstat(2, {st_mode=S_IFREG|0644, st_size=128, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7ca1778000
fstat(2, {st_mode=S_IFREG|0644, st_size=128, ...}) = 0
lseek(2, 128, SEEK_SET) = 128
munmap(0x7f7ca1778000, 4096) = 0
rt_sigaction(SIGILL, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGILL, {sa_handler=0x40c3fa, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x42edb0}, NULL, 8) = 0
rt_sigaction(SIGABRT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGABRT, {sa_handler=0x40c3fa, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x42edb0}, NULL, 8) = 0
rt_sigaction(SIGBUS, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGBUS, {sa_handler=0x40c3fa, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x42edb0}, NULL, 8) = 0
rt_sigaction(SIGSEGV, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGSEGV, {sa_handler=0x40c3fa, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x42edb0}, NULL, 8) = 0
rt_sigaction(SIGSYS, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGSYS, {sa_handler=0x40c3fa, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x42edb0}, NULL, 8) = 0
rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=0x40c3fa, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x42edb0}, NULL, 8) = 0
open("init_data.xml", O_RDONLY) = -1 ENOENT (No such file or directory)
open("boinc_lockfile", O_WRONLY|O_CREAT, 0664) = 3
fcntl(3, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
stat("init_data.xml", 0x7ffd814456b0) = -1 ENOENT (No such file or directory)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xffffffffff600400} ---
write(2, "SIGSEGV: segmentation violation\n", 32) = 32
futex(0x7059a0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xffffffffff600400} ---
+++ killed by SIGSEGV +++
I don't know but maybe this line is significant:
stat("init_data.xml", 0x7ffd814456b0) = -1 ENOENT (No such file or directory)
I checked on another machine. That has a valid init_data.xml. So I do believe that is the issue. How do I fix this and should the code be improved? | |
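Besides the `init_data.xml` line, the strace output also gives the address that faulted. The sketch below extracts it and checks one property of it. On x86-64 Linux the legacy vsyscall page is the 4 KiB page starting at 0xffffffffff600000, and the `si_addr` reported above falls inside that range. Whether that is the actual cause of this crash is an assumption that nobody in the thread tested. Both function names are made up for illustration.

```shell
# Print each distinct fault address found on SIGSEGV lines of an strace
# log read on stdin.
segv_addrs() {
    sed -n 's/^--- SIGSEGV {.*si_addr=\(0x[0-9a-fA-F]*\)} ---$/\1/p' | sort -u
}

# Succeed if the address lies in the 4 KiB legacy vsyscall page on x86-64
# (0xffffffffff600000 to 0xffffffffff600fff). Expects the lowercase hex
# form that strace prints.
in_vsyscall_page() {
    case "$1" in
        0xffffffffff600[0-9a-f][0-9a-f][0-9a-f]) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `strace ./tpsieve_0.3.10d_linux64 -h 2>&1 | segv_addrs` prints the address, which can then be passed to `in_vsyscall_page`.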
|
|
Try resetting the project.
____________
Werinbert is not prime... or PRPnet keeps telling me so.
Badge score: 1x1 + 12x3 + 1x4 + 1x5 + 1x6 + 2x7 + 1x8 + 1x10 = 84 | |
|
stream (Volunteer moderator, Project administrator, Volunteer developer, Volunteer tester)
|
I don't know but maybe this line is significant:
stat("init_data.xml", 0x7ffd814456b0) = -1 ENOENT (No such file or directory)
I checked on another machine. That has a valid init_data.xml. So I do believe that is the issue. How do I fix this and should the code be improved?
It's normal. "init_data.xml" is created by the BOINC client when a task is run under its control. If this file does not exist (which happens when you run the program manually), that's OK: the application will write "running in standalone mode" to the log file and continue to work. I've tested it on a few different systems and was able, at least, to get usage help from the app.
Alas, I'm still out of ideas. If resetting the project does not help, do not run the CPU version of PPS Sieve; use the GPU. GPU programs are also much faster and will give you more credit.
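The strace output quoted earlier in the thread shows the program closing its standard error and reopening it as `stderr.txt` in the current directory, and creating `boinc_lockfile` there too. So a manual run may appear to print nothing on the terminal. A minimal sketch, assuming a POSIX shell with `mktemp`: run the program in a scratch directory and then show what it wrote. The function name `run_standalone` is made up for illustration.

```shell
# Run a program in a fresh scratch directory, report its exit status, and
# show whatever it left in stderr.txt there.
# $1 must be an absolute path, because the working directory changes
# before the program starts; remaining arguments are passed through.
run_standalone() {
    prog="$1"
    shift
    work=$(mktemp -d) || return 1
    ( cd "$work" && "$prog" "$@" )
    status=$?
    echo "exit status: $status (scratch directory: $work)"
    if [ -f "$work/stderr.txt" ]; then
        cat "$work/stderr.txt"
    fi
    return "$status"
}
```

A process killed by a signal is reported by the shell as 128 plus the signal number, so a SIGSEGV (signal 11) would show up here as exit status 139.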
| |
|
|
Try resetting the project.
I will do this after I have drained the current queue. That will take me well over a week.
| |
|
Dave
Joined: 13 Feb 12 Posts: 2861 ID: 130544 Credit: 1,016,086,849 RAC: 1,080,847
|
Just abort them now; they will then get recycled straight away. Quicker for everyone.
|
|
Just abort them now; they will then get recycled straight away. Quicker for everyone.
It is my machine. Please explain how my management of my computers impacts anyone else. | |
|
robish (Volunteer moderator, Volunteer tester)
Joined: 7 Jan 12 Posts: 1855 ID: 126266 Credit: 5,277,336,325 RAC: 3,107,232
|
If your tasks are failing anyway, it makes sense just to abort the remainder and try to fix the problem.
Aborting tasks is OK.
Fixing the problem now will save you a week's crunching and allow you to earn more credit and do tasks correctly.
Either way it won't matter to anyone but you. So it's your choice, but I think, as Dave said, aborting them would be the best approach. They will be recycled immediately and issued again.
____________
My lucky numbers 1059094^1048576+1 and 224584605939537911+81292139*23#*n for n=0..26
|
|
If your tasks are failing anyway, it makes sense just to abort the remainder and try to fix the problem.
Aborting tasks is OK.
Fixing the problem now will save you a week's crunching and allow you to earn more credit and do tasks correctly.
Either way it won't matter to anyone but you. So it's your choice, but I think, as Dave said, aborting them would be the best approach. They will be recycled immediately and issued again.
None of my tasks are currently failing. Only PPS Sieve tasks are failing on one machine. That machine is not currently subscribed to PPS Sieve. I can almost certainly drain the queue without any more tasks failing and if I am wrong about that I could adapt.
In other words I would really appreciate an apology from someone at this point. | |
|
robish (Volunteer moderator, Volunteer tester)
|
An apology for what exactly? You sought help and we tried to help.
|
Scott Brown (Volunteer moderator, Project administrator, Volunteer tester, Project scientist)
Joined: 17 Oct 05 Posts: 2222 ID: 1178 Credit: 9,228,166,781 RAC: 3,207,196
|
None of my tasks are currently failing. Only PPS Sieve tasks are failing on one machine. That machine is not currently subscribed to PPS Sieve. I can almost certainly drain the queue without any more tasks failing and if I am wrong about that I could adapt.
In other words I would really appreciate an apology from someone at this point.
There is obviously some miscommunication going on here. The other users in the thread were suggesting that you abort any PPS Sieve tasks that you still had running on that machine rather than waste your CPU time on them until you were able to fix the errors. Those users cannot see what you currently have running (that is visible only to admins). They were and remain unaware that you are running other PG tasks that may take some time on that CPU to complete before you would be able to do the project reset to see whether PPS Sieve then works or not.
| |
|
|
None of my tasks are currently failing. Only PPS Sieve tasks are failing on one machine. That machine is not currently subscribed to PPS Sieve. I can almost certainly drain the queue without any more tasks failing and if I am wrong about that I could adapt.
In other words I would really appreciate an apology from someone at this point.
There is obviously some miscommunication going on here. The other users in the thread were suggesting that you abort any PPS Sieve tasks that you still had running on that machine rather than waste your CPU time on them until you were able to fix the errors. Those users cannot see what you currently have running (that is visible only to admins). They were and remain unaware that you are running other PG tasks that may take some time on that CPU to complete before you would be able to do the project reset to see whether PPS Sieve then works or not.
Thank you | |
|