PrimeGrid
Message boards : Generalized Fermat Prime Search : Multi-threaded GFN-19 3.4.0.2

pascaltec
Joined: 20 Jan 11
Posts: 56
ID: 82203
Credit: 91,698,156
RAC: 40,001
321 LLR Amethyst: Earned 1,000,000 credits (1,506,146)Cullen LLR Gold: Earned 500,000 credits (511,610)Generalized Cullen/Woodall LLR Gold: Earned 500,000 credits (504,769)PPS LLR Amethyst: Earned 1,000,000 credits (1,700,176)SR5 LLR Amethyst: Earned 1,000,000 credits (1,315,022)SGS LLR Gold: Earned 500,000 credits (616,966)TRP LLR Amethyst: Earned 1,000,000 credits (1,497,999)Woodall LLR Amethyst: Earned 1,000,000 credits (1,016,553)321 Sieve (suspended) Silver: Earned 100,000 credits (104,907)Cullen/Woodall Sieve (suspended) Ruby: Earned 2,000,000 credits (2,851,173)PPS Sieve Emerald: Earned 50,000,000 credits (66,071,136)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Bronze: Earned 10,000 credits (30,358)TRP Sieve (suspended) Silver: Earned 100,000 credits (287,543)AP 26/27 Gold: Earned 500,000 credits (606,450)GFN Turquoise: Earned 5,000,000 credits (8,600,009)WW Ruby: Earned 2,000,000 credits (4,500,000)
Message 155268 - Posted: 30 Apr 2022 | 15:03:14 UTC

Thank you for this new, springtime, multi-threaded version of GFN-19!

https://www.primegrid.com/result.php?resultid=1339100345

The load average is 12 for 12 threads on 12 cores. Perfect scalability. Nice piece of work.

But there is a drawback with this "3.4.0.2" version. When a GFN-19 CPU task and a GFN-19 GPU task run simultaneously, both are awfully slow because they conflict: it seems that the GPU thread disturbs the CPU threads (memory access, granularity, ...). Therefore the GFN-19 GPU task should not be run at the same time.

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 712
ID: 164101
Credit: 305,166,630
RAC: 0
GFN Double Silver: Earned 200,000,000 credits (305,166,630)
Message 155269 - Posted: 30 Apr 2022 | 19:09:27 UTC - in response to Message 155268.
Last modified: 30 Apr 2022 | 19:10:05 UTC

Thank you for the announcement, and in French!

One core should be left available for the GPU driver. It creates a large stack of commands and sends them to the GPU. I don't know its size, but it should be much smaller than the Ryzen 9 5900X L3 cache.

The data size for one GFN-19 task is 10 MB. The Ryzen 9 cache is organized as two CCX, each with its own 32 MB L3 (2x32 MB), so at least two tasks must run concurrently. What is the best throughput?
Max # of threads for each task =
- 5: 2 tasks => 10 cores, 20 MB.
- 3 (and 90% CPU): 3 tasks => 9 cores, 30 MB.
- 2 (and 90% CPU): 5 tasks => 10 cores, 50 MB.
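
The three options above can be tabulated with a few lines of shell. This is only a sketch of the arithmetic: the 10 MB per task figure is the one quoted in this post, and the function name footprint is made up for the illustration. It does not measure throughput.

```shell
#!/bin/sh
# Data size of one GFN-19 task in MB (figure quoted in the post above).
TASK_MB=10

# footprint THREADS TASKS
# Prints the number of cores used and the total data size of all tasks.
# (Hypothetical helper, for illustration only.)
footprint() {
    threads=$1
    tasks=$2
    echo "$threads threads x $tasks tasks => $((threads * tasks)) cores, $((tasks * TASK_MB)) MB"
}

footprint 5 2
footprint 3 3
footprint 2 5
```

The totals are to be compared with the 2x32 MB of L3 cache: which option gives the best throughput still has to be measured.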

pascaltec
Message 155271 - Posted: 1 May 2022 | 4:51:30 UTC - in response to Message 155269.
Last modified: 1 May 2022 | 5:39:46 UTC

Yves,
An official announcement is always welcome after a test period.
With my sincere thanks, and a remark to feed the discussion.
Best regards, Pascal.

For the optimal use of caches proposed by Yves, the threads of a task have to be bound (pinned) to the same CCX.
Could the multi-threaded GFN-19 version automatically set the affinities (of the OpenMP threads?) according to the architecture?

For proper use of the dedicated L1, L2 and L3 caches, threads have to be bound (pinned) appropriately within each CCX.
How much this further optimization gains depends mostly on how the GFN-19 application uses the L3 cache.

The following example shows how to set the affinities manually (using sudo htop, shortcut a).
The topology was checked using numactl, which does NOT separate the two CCX (as it would for two processors).
The affinity setup was verified using lstopo --ps --top.

Process 7735 is bound (pinned) to cores (0, 1, 2, 3, 4, 5) and process 7737 to (6, 7, 8, 9, 10, 11).
Each thread of process 7735 is bound to one core of CCX 1: 7736 is bound to core 0, 7739 to 1, ...
Each thread of process 7737 is bound to one core of CCX 2: 7738 is bound to core 6, 7744 to 7, ...

    pascaltec@valtin:~$ lstopo --ps --top --of console
    Machine (16GB total)
    Package L#0
    NUMANode L#0 (P#0 16GB)

    L3 L#0 (32MB)
    L2 L#0 (512KB)
    L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
    Misc(Process) 7735 genefer_linux64 7736
    L2 L#1 (512KB)
    L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
    Misc(Process) 7735 genefer_linux64 7739
    L2 L#2 (512KB)
    L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
    Misc(Process) 7735 genefer_linux64 7740
    L2 L#3 (512KB)
    L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
    Misc(Process) 7735 genefer_linux64 7741
    L2 L#4 (512KB)
    L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4)
    Misc(Process) 7735 genefer_linux64 7742
    L2 L#5 (512KB)
    L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5)
    Misc(Process) 7735 genefer_linux64 7743
    Misc(Process) 7735 genefer_linux64

    L3 L#1 (32MB)
    L2 L#6 (512KB)
    L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#6)
    Misc(Process) 7737 genefer_linux64 7738
    L2 L#7 (512KB)
    L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#7)
    Misc(Process) 7737 genefer_linux64 7744
    L2 L#8 (512KB)
    L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#8)
    Misc(Process) 7737 genefer_linux64 7745
    L2 L#9 (512KB)
    L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#9)
    Misc(Process) 7737 genefer_linux64 7746
    L2 L#10 (512KB)
    L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 (P#10)
    Misc(Process) 7737 genefer_linux64 7747
    L2 L#11 (512KB)
    L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 (P#11)
    Misc(Process) 7737 genefer_linux64 7748
    Misc(Process) 7737 genefer_linux64


As previously observed and indicated by Yves, two concurrent tasks are more efficient when each runs on its own CCX.
Once again the scalability is perfect: the load average remains at 12 on 12 cores, with 2 tasks of 6 cores each.
https://www.primegrid.com/workunit.php?wuid=791907057
https://www.primegrid.com/workunit.php?wuid=791891084
Estimated runtime: 1 hour 57 minutes for 2 tasks run concurrently on 6 cores each, much less than the 2 x 1 hour 22 minutes 30 seconds (2 hours 45 minutes) needed for 2 tasks run one after the other on 12 cores.
https://www.primegrid.com/workunit.php?wuid=791848937
https://www.primegrid.com/workunit.php?wuid=791840535
No statistics yet ...

pascaltec@valtin:~$ sensors; lscpu -e

k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +67.9°C
Tccd1: +65.5°C
Tccd2: +62.2°C

CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ MHZ
0 0 0 0 0:0:0:0 yes 4950,1948 2200,0000 4036.961
1 0 0 1 1:1:1:0 yes 4950,1948 2200,0000 4037.055
2 0 0 2 2:2:2:0 yes 4950,1948 2200,0000 4037.133
3 0 0 3 3:3:3:0 yes 4950,1948 2200,0000 4037.188
4 0 0 4 4:4:4:0 yes 4950,1948 2200,0000 4037.246
5 0 0 5 5:5:5:0 yes 4950,1948 2200,0000 4037.293
6 0 0 6 8:8:8:1 yes 4950,1948 2200,0000 4037.186
7 0 0 7 9:9:9:1 yes 4950,1948 2200,0000 4037.231
8 0 0 8 10:10:10:1 yes 4950,1948 2200,0000 4037.280
9 0 0 9 11:11:11:1 yes 4950,1948 2200,0000 4037.340
10 0 0 10 12:12:12:1 yes 4950,1948 2200,0000 4037.391
11 0 0 11 13:13:13:1 yes 4950,1948 2200,0000 4037.437

Yves Gallot
Message 155273 - Posted: 1 May 2022 | 10:11:43 UTC - in response to Message 155271.

Could the multi-threaded GFN-19 version automatically set the affinities (of OpenMP threads ?) with respect to the architecture ?

The main problem is that BOINC doesn't tell the app how many apps are running concurrently, or the index of its process among them.

But OpenMP environment variables can do the job on Linux (OpenMP affinity is not supported on Windows).

OMP_DISPLAY_ENV = true and OMP_DISPLAY_AFFINITY = true are useful.

OMP_PLACES = cores should be sufficient because the placement of the threads is ordered: thread #0 => core #0, thread #1 => core #1, etc. Then, for 2 tasks of 6 cores, the first task should run on CCX #0 and the second one on CCX #1.

OMP_PLACES = ll_caches is optimized for complex L3 cache organization and OMP_PLACES = numa_domains for multiprocessor systems. I never tried them and I don't know why 'cores' is not sufficient.

pascaltec
Message 155277 - Posted: 2 May 2022 | 5:25:27 UTC
Last modified: 2 May 2022 | 5:40:24 UTC

Assuming that tasks require the same number of operations, the elapsed time WITHOUT setting affinities is 2 hours and 25 minutes.
This overhead of 28 minutes (~20%) is due to the fact that the OpenMP threads of a task do NOT all use the same L3 cache (and transfer data from one CCX to the other).
https://www.primegrid.com/workunit.php?wuid=792102301
https://www.primegrid.com/workunit.php?wuid=792064239
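
The overhead figure can be re-derived from the two elapsed times quoted in this thread (1 hour 57 minutes with affinities set, 2 hours 25 minutes without). A small sketch of the arithmetic, using integer minutes; the helper name pct is made up here:

```shell
#!/bin/sh
# Elapsed times reported in the thread, converted to minutes.
pinned_min=$((1 * 60 + 57))     # affinities set
unpinned_min=$((2 * 60 + 25))   # affinities not set
overhead_min=$((unpinned_min - pinned_min))

# pct PART WHOLE -> PART as a percentage of WHOLE, rounded to the nearest integer.
# (Hypothetical helper, for illustration only.)
pct() {
    echo $(( ($1 * 100 + $2 / 2) / $2 ))
}

echo "overhead: $overhead_min min"
echo "relative to the unpinned run: $(pct "$overhead_min" "$unpinned_min")%"
echo "relative to the pinned run: $(pct "$overhead_min" "$pinned_min")%"
```

The quoted ~20% corresponds to the overhead taken relative to the unpinned run; relative to the pinned run it is a few points higher.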

GFN-19 multithreaded seems to be insensitive to OpenMP environment variable OMP_PLACES.
OMP_PROC_BIND is mandatory to activate the affinity policy: export OMP_PROC_BIND="true".

Inside the GFN-19 source code, is a call to omp_get_proc_bind() required ? OR
a change to the pragma directive : #pragma omp parallel proc_bind(master) ??
Reference: https://www.openmp.org/spec-html/5.0/openmpse14.html

"The master thread affinity policy instructs the execution environment to assign every thread in the team to the same "place" as the master thread.
The "place" partition is not changed by this policy, and each implicit task inherits the place-partition-var ICV of the parent implicit task."

The place-partition-var is set by OMP_PLACES depending on the use of the processor architecture:
There are various ways for the user to set the policy (defined by OMP_PLACES in /etc/init.d/boinc ?)

For two CCX, 2 tasks of 6 threads each according to numa labelling : export OMP_PLACES="{0:5},{6,11}"
For two CCX, 4 tasks of 3 threads each according to numa labelling : export OMP_PLACES="{0:2},{3:5},{6:8},{9:11}"
Last-level caches would indeed be the most appropriate option ... if "last level" refers to the L3 cache! (not verified)
export OMP_PLACES="ll_caches"

(setup not validated yet).

Yves Gallot
Message 155278 - Posted: 2 May 2022 | 8:02:03 UTC - in response to Message 155277.

OMP_PROC_BIND is mandatory to activate the affinity policy: export OMP_PROC_BIND="true".

OMP_PROC_BIND is not needed because, "when undefined, OMP_PROC_BIND defaults to TRUE when OMP_PLACES or GOMP_CPU_AFFINITY is set and FALSE otherwise."

Inside the GFN-19 source code, is a call to omp_get_proc_bind() required?

This would overwrite OMP_PROC_BIND. Only the environment variable OMP_NUM_THREADS is overwritten.

For two CCX, 2 tasks of 6 threads each according to numa labelling : export OMP_PLACES="{0:5},{6,11}"

This is not correct: the syntax is <lower-bound>:<length>, so it should be export OMP_PLACES="{0:6},{6:6}"
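
To illustrate the <lower-bound>:<length> form, here is a small sketch that builds one place per task. It assumes that the cores of each task are numbered contiguously (as in the lscpu listing earlier in the thread); the function name places_for is hypothetical. The value for 4 tasks of 3 threads is simply derived from the same rule and has not been validated in this thread.

```shell
#!/bin/sh
# places_for TASKS THREADS
# Prints one {lower-bound:length} place per task, assuming that task i
# owns the cores i*THREADS .. i*THREADS + THREADS - 1.
# (Hypothetical helper, for illustration only.)
places_for() {
    tasks=$1
    threads=$2
    out=""
    i=0
    while [ "$i" -lt "$tasks" ]; do
        # Append a comma only when the list is not empty.
        out="${out:+$out,}{$((i * threads)):$threads}"
        i=$((i + 1))
    done
    echo "$out"
}

places_for 2 6   # prints {0:6},{6:6}
places_for 4 3   # prints {0:3},{3:3},{6:3},{9:3}
```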

On my computer (4 cores): first OMP_DISPLAY_AFFINITY is set to display affinity info:

export OMP_DISPLAY_AFFINITY=true
./genefer_linux64 --nthreads 2 -q "5000000^524288+1"
level 1 thread 0x27e7940 affinity 0-3
level 1 thread 0x7f6a5220a700 affinity 0-3

Affinity is "free": 0-3. Now with OMP_PLACES:

export OMP_PLACES="{0:2}"
./genefer_linux64 --nthreads 2 -q "5000000^524288+1"
level 1 thread 0x1ac8940 affinity 0-1
level 1 thread 0x7fe6257bb700 affinity 0-1

export OMP_PLACES="{2:2}"
./genefer_linux64 --nthreads 2 -q "5000000^524288+1"
level 1 thread 0x18f7940 affinity 2-3
level 1 thread 0x7ff1453b6700 affinity 2-3

There are various ways for the user to set the policy (defined by OMP_PLACES in /etc/init.d/boinc ?)

I don't know about boinc variables, but since genefer is a child process of boinc, the environment variables must be defined before boinc is started (as system-wide environment variables in /etc/environment?).

pascaltec
Message 155280 - Posted: 2 May 2022 | 17:04:19 UTC
Last modified: 2 May 2022 | 17:52:16 UTC

Thanks, Yves, for the corrections and tips!

A first quick answer:
You're right: the conclusion is that OMP_PLACES has to be different for each task.
(It is not a rule for placing the threads of multiple instances, but the rule for the current instance.)

This is usually handled by the job manager (slurm, a counterpart of boinc) and is transparent
to the user (until today!). The job manager reserves a range or a list of cores or threads or ...

The next question is:
How can OMP_PLACES be set automatically at the start of the instance (without changing genefer),
depending on the NUMA architecture and/or a user-defined policy? (It should also work on Windows.)

Edit: by just changing the command line?
OMP_PLACES="{0}:6:1" ../genefer_linux64_3.4.0-2 --nthreads 6 -q "5000000^524288+1"

in /var/lib/boinc/slots/0$ more genefer_linux64_3.4.0-2
<soft_link>../../projects/www.primegrid.com/genefer_linux64_3.4.0-2</soft_link>


At runtime it is quite simple to apply a range of cores to the instance:
sudo taskset -p -c 0-5 <pid1> for the master thread of the first task and
sudo taskset -p -c 6-11 <pid2> for the master thread of the second one.

===== Console #1 : First task ==================================================

$ export OMP_PLACES="{0}:6:1"
$ ...
$ env | grep OMP
OMP_PLACES={0}:6:1
OMP_DISPLAY_ENV=true
OMP_DISPLAY_AFFINITY=true
OMP_NUM_THREADS=6

$ ../genefer_linux64_3.4.0-2 --nthreads 6 -q "5000000^524288+1"

OPENMP DISPLAY ENVIRONMENT BEGIN
_OPENMP = '201511'
OMP_DYNAMIC = 'FALSE'
OMP_NESTED = 'FALSE'
OMP_NUM_THREADS = '6'
OMP_SCHEDULE = 'DYNAMIC'
OMP_PROC_BIND = 'TRUE'
OMP_PLACES = '{0},{1},{2},{3},{4},{5}'
OMP_STACKSIZE = '0'
OMP_WAIT_POLICY = 'PASSIVE'
OMP_THREAD_LIMIT = '4294967295'
OMP_MAX_ACTIVE_LEVELS = '1'
OMP_CANCELLATION = 'FALSE'
OMP_DEFAULT_DEVICE = '0'
OMP_MAX_TASK_PRIORITY = '0'
OMP_DISPLAY_AFFINITY = 'TRUE'
OMP_AFFINITY_FORMAT = 'level %L thread %i affinity %A'
OMP_ALLOCATOR = 'omp_default_mem_alloc'
OMP_TARGET_OFFLOAD = 'DEFAULT'
OPENMP DISPLAY ENVIRONMENT END
genefer 3.4.0.2 (CPU/Linux/64-bit/gcc-11.1.0/BOINC-7.17.0)

Copyright 2001-2022, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Michael Goetz, Ronald Schneider
Copyright 2011-2018, Iain Bethune
Genefer is free source code, under the MIT license.

Supported transform implementations: fma avx sse4 sse2 fmai avxi sse4i sse2i

Command line: ../genefer_linux64_3.4.0-2 --nthreads 6 -q 5000000^524288+1

Low priority change succeeded.

Testing 5000000^524288+1...
Using fmai transform (6 threads)
Resuming 5000000^524288+1 from a checkpoint (11351958 iterations left)
level 1 thread 0x9bf940 affinity 0 (3998 master thread)
level 1 thread 0x7f63937ed700 affinity 1 (3999)
level 1 thread 0x7f6392fec700 affinity 2 (4000)
level 1 thread 0x7f63927eb700 affinity 3 (4001)
level 1 thread 0x7f6391fea700 affinity 4 (4002)
level 1 thread 0x7f63917e9700 affinity 5 (4003)

===== Console #2 : Second task =================================================

$ export OMP_PLACES="{6}:6:1"
$ ...
$ env | grep OMP
OMP_PLACES={6}:6:1
OMP_DISPLAY_ENV=true
OMP_DISPLAY_AFFINITY=true
OMP_NUM_THREADS=6

$ ../genefer_linux64_3.4.0-2 --nthreads 6 -q "5000000^524288+1"

OPENMP DISPLAY ENVIRONMENT BEGIN
_OPENMP = '201511'
OMP_DYNAMIC = 'FALSE'
OMP_NESTED = 'FALSE'
OMP_NUM_THREADS = '6'
OMP_SCHEDULE = 'DYNAMIC'
OMP_PROC_BIND = 'TRUE'
OMP_PLACES = '{6},{7},{8},{9},{10},{11}'
OMP_STACKSIZE = '0'
OMP_WAIT_POLICY = 'PASSIVE'
OMP_THREAD_LIMIT = '4294967295'
OMP_MAX_ACTIVE_LEVELS = '1'
OMP_CANCELLATION = 'FALSE'
OMP_DEFAULT_DEVICE = '0'
OMP_MAX_TASK_PRIORITY = '0'
OMP_DISPLAY_AFFINITY = 'TRUE'
OMP_AFFINITY_FORMAT = 'level %L thread %i affinity %A'
OMP_ALLOCATOR = 'omp_default_mem_alloc'
OMP_TARGET_OFFLOAD = 'DEFAULT'
OPENMP DISPLAY ENVIRONMENT END
genefer 3.4.0.2 (CPU/Linux/64-bit/gcc-11.1.0/BOINC-7.17.0)

Copyright 2001-2022, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2014, Michael Goetz, Ronald Schneider
Copyright 2011-2018, Iain Bethune
Genefer is free source code, under the MIT license.

Supported transform implementations: fma avx sse4 sse2 fmai avxi sse4i sse2i

Command line: ../genefer_linux64_3.4.0-2 --nthreads 6 -q 5000000^524288+1

Low priority change succeeded.

Testing 5000000^524288+1...
Using fmai transform (6 threads)
Resuming 5000000^524288+1 from a checkpoint (11347503 iterations left)
level 1 thread 0x1bcc940 affinity 6 (4006 master thread)
level 1 thread 0x7f8bdc8fa700 affinity 7 (4007)
level 1 thread 0x7f8bdc0f9700 affinity 8 (4008)
level 1 thread 0x7f8bdb8f8700 affinity 9 (4009)
level 1 thread 0x7f8bdb0f7700 affinity 10 (4010)
level 1 thread 0x7f8bda8f6700 affinity 11 (4011)

===== Console #3 : Show the topology of the system and threads binding =========

$ lstopo --ps --top --of console
Machine (16GB total)
Package L#0
NUMANode L#0 (P#0 16GB)
L3 L#0 (32MB)
L2 L#0 (512KB)
L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
Misc(Process) 3998 genefer_linux64 3998 genefer_linux64
L2 L#1 (512KB)
L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
Misc(Process) 3998 genefer_linux64 3999 genefer_linux64
L2 L#2 (512KB)
L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
Misc(Process) 3998 genefer_linux64 4000 genefer_linux64
L2 L#3 (512KB)
L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
Misc(Process) 3998 genefer_linux64 4001 genefer_linux64
L2 L#4 (512KB)
L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4)
Misc(Process) 3998 genefer_linux64 4002 genefer_linux64
L2 L#5 (512KB)
L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5)
Misc(Process) 3998 genefer_linux64 4003 genefer_linux64
Misc(Process) 3998 genefer_linux64
L3 L#1 (32MB)
L2 L#6 (512KB)
L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#6)
Misc(Process) 4006 genefer_linux64 4006 genefer_linux64
L2 L#7 (512KB)
L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#7)
Misc(Process) 4006 genefer_linux64 4007 genefer_linux64
L2 L#8 (512KB)
L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#8)
Misc(Process) 4006 genefer_linux64 4008 genefer_linux64
L2 L#9 (512KB)
L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#9)
Misc(Process) 4006 genefer_linux64 4009 genefer_linux64
L2 L#10 (512KB)
L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 (P#10)
Misc(Process) 4006 genefer_linux64 4010 genefer_linux64
L2 L#11 (512KB)
L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 (P#11)
Misc(Process) 4006 genefer_linux64 4011 genefer_linux64
Misc(Process) 4006 genefer_linux64

The main advantage of the setting {0}:6:1 is that a thread does NOT move from one core to another within the range of places, and keeps the same L1i, L1d and L2 caches throughout the whole run (I don't know whether this is useful for genefer). The second advantage is that the children (threads) of the master thread share the L3 cache of a single CCX (without wandering around). The third advantage is that one instance makes efficient use of one CCX.
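
For readers unfamiliar with the {start}:count:stride form, the following sketch expands it into the explicit list that OMP_DISPLAY_ENV reports in the console output above. The function name expand_places is made up for this illustration, and it only handles a single-core starting place such as {0} or {6}.

```shell
#!/bin/sh
# expand_places START COUNT STRIDE
# Expands the OMP_PLACES form {START}:COUNT:STRIDE into an explicit list
# of single-core places. (Hypothetical helper, for illustration only.)
expand_places() {
    start=$1
    count=$2
    stride=$3
    out=""
    i=0
    while [ "$i" -lt "$count" ]; do
        out="${out:+$out,}{$((start + i * stride))}"
        i=$((i + 1))
    done
    echo "$out"
}

expand_places 0 6 1   # prints {0},{1},{2},{3},{4},{5}
expand_places 6 6 1   # prints {6},{7},{8},{9},{10},{11}
```

Each thread gets a place of its own, which is why the threads stay on their cores instead of moving within a range.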

pascaltec
Message 155281 - Posted: 3 May 2022 | 6:14:44 UTC
Last modified: 3 May 2022 | 6:17:58 UTC

Various examples of hardware topology can be found by searching for lstopo.

The hardware locality is retrieved using lstopo (sudo apt install hwloc).
The Non-Uniform Memory Access (NUMA) layout is retrieved using numactl (sudo apt install numactl).

The scheduler has to count the number of sockets and L3 caches in order to define the subsets of cores
that share the same cache and/or the same fastest memory accesses, or the same socket on multiprocessor systems, i.e. the same hardware locality.

The scheduler has to define a pool of these subsets, expressed as OMP_PLACES values (like the GPU pool).
Just before launching a task, the scheduler chooses an available free subset and then places
the appropriate prefix, that is the OMP_PLACES setting, at the beginning of the command line.

Should work! Help is welcome to modify the boinc scheduling.
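
A rough sketch of the "choose a free subset" step, to make the idea concrete. Nothing here is part of boinc: the pool contents, the way busy subsets are tracked and the function name pick_free_subset are all assumptions for the illustration.

```shell
#!/bin/sh
# pick_free_subset "POOL" "BUSY"
# POOL and BUSY are space-separated lists of OMP_PLACES values.
# Prints the first pool entry that is not busy; fails if none is free.
# (Hypothetical helper, for illustration only.)
pick_free_subset() {
    for place in $1; do
        case " $2 " in
            *" $place "*) continue ;;   # already used by a running task
        esac
        echo "$place"
        return 0
    done
    return 1
}

# Assumed pool for two CCX of 6 cores each.
POOL="{0}:6:1 {6}:6:1"

# With the first subset busy, the second one is chosen as the prefix.
if free=$(pick_free_subset "$POOL" "{0}:6:1"); then
    echo "OMP_PLACES=\"$free\""
fi
```

The printed prefix is what would be placed in front of the genefer command line; releasing a subset when its task ends is not shown.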

pascaltec
Message 155386 - Posted: 11 May 2022 | 5:37:09 UTC
Last modified: 11 May 2022 | 5:42:01 UTC

A simple workaround to set the affinity of genefer threads
The bash script pin-genefer-threads is scheduled in the crontab to run automatically every minute.

$ cat /etc/crontab

*/1 * * * * root pin-genefer-threads

$ cat pin-genefer-threads
#!/bin/bash
# Pin each genefer thread to its own processor, one processor after the other.
processor=-1
pids=$(ps -u boinc | grep genefer | awk '{print $1}')
for pid in $pids; do
    spids=$(ps -Tu boinc | grep "$pid" | awk '{print $2}')
    for spid in $spids; do
        # The thread with id pid+1 is the unused one: it shares the master's processor.
        if [[ $((spid - pid)) -ne 1 ]]; then
            processor=$((processor + 1))
        fi
        echo "$pid" ': taskset -pc' "$processor" "$spid"
        taskset -pc "$processor" "$spid"
    done
done
ps -TFu boinc
# That's all, folks !!!

genefer_linux64_3.4.0-3 creates one unused thread.
(This was not the case with the previous version.)

$ ./pin-genefer-threads
14690 : taskset -pc 0 14690
pid 14690's current affinity list: 0-11
pid 14690's new affinity list: 0
14690 : taskset -pc 0 14691
pid 14691's current affinity list: 0-11
pid 14691's new affinity list: 0
14690 : taskset -pc 1 14694
pid 14694's current affinity list: 0-11
pid 14694's new affinity list: 1
14690 : taskset -pc 2 14695
pid 14695's current affinity list: 0-11
pid 14695's new affinity list: 2
14690 : taskset -pc 3 14696
pid 14696's current affinity list: 0-11
pid 14696's new affinity list: 3
14690 : taskset -pc 4 14697
pid 14697's current affinity list: 0-11
pid 14697's new affinity list: 4
14690 : taskset -pc 5 14698
pid 14698's current affinity list: 0-11
pid 14698's new affinity list: 5
14692 : taskset -pc 6 14692
pid 14692's current affinity list: 0-11
pid 14692's new affinity list: 6
14692 : taskset -pc 6 14693
pid 14693's current affinity list: 0-11
pid 14693's new affinity list: 6
14692 : taskset -pc 7 14699
pid 14699's current affinity list: 0-11
pid 14699's new affinity list: 7
14692 : taskset -pc 8 14700
pid 14700's current affinity list: 0-11
pid 14700's new affinity list: 8
14692 : taskset -pc 9 14701
pid 14701's current affinity list: 0-11
pid 14701's new affinity list: 9
14692 : taskset -pc 10 14702
pid 14702's current affinity list: 0-11
pid 14702's new affinity list: 10
14692 : taskset -pc 11 14703
pid 14703's current affinity list: 0-11
pid 14703's new affinity list: 11
UID PID SPID PPID C SZ RSS PSR STIME TTY TIME CMD
boinc 12326 12326 1 0 46258 16868 6 06:49 ? 00:00:01 /usr/bin/boinc
boinc 12326 12477 1 0 46258 16868 5 06:49 ? 00:00:00 /usr/bin/boinc
boinc 14690 14690 12326 98 13142 6392 0 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495386^65536+1 --nthreads 6
boinc 14690 14691 12326 0 13142 6392 2 07:31 ? 00:00:00 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495386^65536+1 --nthreads 6
boinc 14690 14694 12326 98 13142 6392 1 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495386^65536+1 --nthreads 6
boinc 14690 14695 12326 98 13142 6392 2 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495386^65536+1 --nthreads 6
boinc 14690 14696 12326 98 13142 6392 3 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495386^65536+1 --nthreads 6
boinc 14690 14697 12326 98 13142 6392 4 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495386^65536+1 --nthreads 6
boinc 14690 14698 12326 98 13142 6392 5 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495386^65536+1 --nthreads 6
boinc 14692 14692 12326 98 13126 6324 6 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495530^65536+1 --nthreads 6
boinc 14692 14693 12326 0 13126 6324 5 07:31 ? 00:00:00 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495530^65536+1 --nthreads 6
boinc 14692 14699 12326 98 13126 6324 7 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495530^65536+1 --nthreads 6
boinc 14692 14700 12326 97 13126 6324 8 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495530^65536+1 --nthreads 6
boinc 14692 14701 12326 98 13126 6324 9 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495530^65536+1 --nthreads 6
boinc 14692 14702 12326 98 13126 6324 10 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495530^65536+1 --nthreads 6
boinc 14692 14703 12326 98 13126 6324 11 07:31 ? 00:00:27 ../../projects/www.primegrid.com/genefer_linux64_3.4.0-3 -boinc -q 166495530^65536+1 --nthreads 6

pascaltec
Message 155396 - Posted: 11 May 2022 | 16:42:42 UTC
Last modified: 11 May 2022 | 17:25:55 UTC

pin-genefer-threads

Properly setting the affinity of the genefer OpenMP threads decreases the CPU time by 20%.

This script sets (with taskset) the affinity of the genefer threads, selecting the processors one after the other
regardless of the number of threads allocated to each genefer task (# of threads for each task).
This version of the script is NOT compatible with "leave tasks in memory while suspended = yes".

The script is placed in /var/lib/boinc, is executable and is owned by root.

-rwxr--r-- 1 root root 513 may 11 17:55 /var/lib/boinc/pin-genefer-threads*


The script is executed every minute by cron.
The output of the script is redirected to the journal ( | systemd-cat ).

$ cat /etc/crontab

*/1 * * * * root /var/lib/boinc/pin-genefer-threads | systemd-cat


The script issues a taskset command only if the affinity is not properly set.

$ cat pin-genefer-threads
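#!/bin/bash
# Pin each genefer thread to its own processor, but only when it is not pinned yet.
processor=-1
pids=$(ps -u boinc | grep genefer | awk '{print $1}')
for pid in $pids; do
    spids=$(ps -Tu boinc | grep "$pid" | awk '{print $2}')
    for spid in $spids; do
        # The thread with id pid+1 is the unused one: it shares the master's processor.
        if [[ $((spid - pid)) -ne 1 ]]; then
            processor=$((processor + 1))
        fi
        # Sixth field of "pid N's current affinity list: LIST" is LIST.
        taskget=$(taskset -pc "$spid" | awk '{print $6}')
        # Compare as strings: a list such as 0-11 must not be evaluated as arithmetic.
        if [[ "$taskget" != "$processor" ]]; then
            echo 'boinc genefer process' "$pid" ': taskset -pc' "$processor" "$spid"
            taskset -pc "$processor" "$spid"
        fi
    done
done
# That's all, folks !!!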
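
The "only if the affinity is not properly set" test can be tried without touching any process. The sketch below works on a report line in the format that taskset -pc prints (pid N's current affinity list: LIST); the function names affinity_from_line and needs_pin are made up for the illustration.

```shell
#!/bin/sh
# affinity_from_line LINE
# Prints the affinity list, i.e. the text after the last ": " of the line.
# (Hypothetical helper, for illustration only.)
affinity_from_line() {
    echo "${1##*: }"
}

# needs_pin LINE PROCESSOR
# Succeeds when the reported list differs from PROCESSOR. The comparison is
# done on strings, so a list such as 0-11 is never treated as arithmetic.
needs_pin() {
    [ "$(affinity_from_line "$1")" != "$2" ]
}

needs_pin "pid 8649's current affinity list: 0-11" 0 && echo "8649: pin to 0"
needs_pin "pid 8649's current affinity list: 0" 0 || echo "8649: already pinned"
```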


An example of the output to the journal (every minute).

$ journalctl -f
mai 11 18:09:01 valtin CRON[8665]: (root) CMD ( /var/lib/boinc/pin-genefer-threads | systemd-cat)
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8649 : taskset -pc 0 8649
mai 11 18:09:01 valtin cat[8667]: pid 8649's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8649's new affinity list: 0
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8649 : taskset -pc 0 8650
mai 11 18:09:01 valtin cat[8667]: pid 8650's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8650's new affinity list: 0
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8649 : taskset -pc 1 8651
mai 11 18:09:01 valtin cat[8667]: pid 8651's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8651's new affinity list: 1
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8649 : taskset -pc 2 8652
mai 11 18:09:01 valtin cat[8667]: pid 8652's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8652's new affinity list: 2
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8649 : taskset -pc 3 8653
mai 11 18:09:01 valtin cat[8667]: pid 8653's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8653's new affinity list: 3
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8649 : taskset -pc 4 8654
mai 11 18:09:01 valtin cat[8667]: pid 8654's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8654's new affinity list: 4
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8649 : taskset -pc 5 8655
mai 11 18:09:01 valtin cat[8667]: pid 8655's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8655's new affinity list: 5
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8656 : taskset -pc 6 8656
mai 11 18:09:01 valtin cat[8667]: pid 8656's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8656's new affinity list: 6
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8656 : taskset -pc 6 8657
mai 11 18:09:01 valtin cat[8667]: pid 8657's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8657's new affinity list: 6
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8656 : taskset -pc 7 8658
mai 11 18:09:01 valtin cat[8667]: pid 8658's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8658's new affinity list: 7
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8656 : taskset -pc 8 8659
mai 11 18:09:01 valtin cat[8667]: pid 8659's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8659's new affinity list: 8
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8656 : taskset -pc 9 8660
mai 11 18:09:01 valtin cat[8667]: pid 8660's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8660's new affinity list: 9
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8656 : taskset -pc 10 8661
mai 11 18:09:01 valtin cat[8667]: pid 8661's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8661's new affinity list: 10
mai 11 18:09:01 valtin cat[8667]: boinc genefer process 8656 : taskset -pc 11 8662
mai 11 18:09:01 valtin cat[8667]: pid 8662's current affinity list: 0-11
mai 11 18:09:01 valtin cat[8667]: pid 8662's new affinity list: 11
mai 11 18:09:01 valtin CRON[8664]: pam_unix(cron:session): session closed for user root
...
mai 11 18:10:01 valtin CRON[8736]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
mai 11 18:10:01 valtin CRON[8737]: (root) CMD ( /var/lib/boinc/pin-genefer-threads | systemd-cat)
mai 11 18:10:01 valtin CRON[8736]: pam_unix(cron:session): session closed for user root
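The journal shows why nothing is logged at 18:10:01: once a thread's affinity list equals its target processor, no taskset command is issued. The script compares the sixth field of the taskset output with -ne, which bash evaluates arithmetically (a range such as 0-11 is read as 0 minus 11). A possible alternative, shown here only as a sketch with a hypothetical helper name, is a plain string comparison:

```shell
# needs_pin is a hypothetical helper, not part of the original script.
# $1: a line as printed by "taskset -pc", e.g.
#     "pid 8649's current affinity list: 0-11"
# $2: the processor the thread should be pinned to
# Succeeds (exit status 0) when the affinity list is not exactly $2.
needs_pin() {
  current=$(printf '%s\n' "$1" | awk '{print $6}')
  [ "$current" != "$2" ]
}
```

For example, needs_pin "pid 8649's current affinity list: 0-11" 0 succeeds, so the thread would be pinned, while the same call with an affinity list of 0 fails and the thread is left alone.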

pascaltec
Message 155445 - Posted: 15 May 2022 | 11:58:51 UTC - in response to Message 155268.

But there is a drawback with this "3.4.0.2" version. Be aware that when a GFN-19 CPU and a GFN-19 GPU run simultaneously, both are awfully slow since conflicting: it seems that the GPU thread perturbs the CPU threads (memory access, granularity, ...). Therefore the GFN-19 GPU unit could not be run simultaneously.

As discussed in the thread https://www.primegrid.com/forum_thread.php?id=9914, the high CPU
usage of a GPU task can be avoided by preloading libsleep: http://mk.junkyard.one.pl/libsleep.c

How to: https://www.primegrid.com/forum_thread.php?id=7731&nowrap=true#113965

The perfect scalability of multi-threaded GFN-19 is no longer disturbed by GPU computing!

Thank you, guys !

pascaltec
Message 155512 - Posted: 21 May 2022 | 2:53:30 UTC
Last modified: 21 May 2022 | 3:52:12 UTC

Overview of the final settings for PrimeGrid's Geek Pride Day Challenge 2022

Choosing the right number of concurrent genefer tasks depends on the architecture of the processor and especially on the size of the L3 cache.
https://www.primegrid.com/forum_thread.php?id=9915&nowrap=true#155484
A thousand thanks, Yves, for these valuable pointers and this new, optimal version of genefer!

On a Ryzen 9 5900X with 32 MB of L3 cache per die (CCX) under Ubuntu, multithreading provides the same scaling for:
2 tasks of 6 threads: 10 MB of L3 cache and 6 cores used on each CCX.
4 tasks of 3 threads: 20 MB of L3 cache and 6 cores used on each CCX.
6 tasks of 2 threads: 30 MB of L3 cache and 6 cores used on each CCX.

1 task of 12 threads would NOT stay within a single L3 cache; its threads would communicate through the Infinity Fabric.
3 tasks of 4 threads: the first and third tasks would each stay within one L3 cache, but NOT the second one, which is spread over two CCX. The discrepancy can easily be tested using this configuration.
12 tasks of 1 thread would require too much L3 cache (60 MB per CCX), and this is no longer multithreading with cache sharing, is it?!
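The reasoning above can be written as a small check. This is only a sketch: l3_check is a hypothetical helper, and the figure of about 10 MB of L3 per GFN-19 task is inferred from the numbers in this post (one task per CCX uses 10 MB), not a measured constant.

```shell
# Assumptions taken from the post: 2 CCX of 6 cores, 32 MB of L3 per CCX,
# roughly 10 MB of L3 needed per GFN-19 task (inferred, not measured).
CORES_PER_CCX=6
L3_PER_CCX_MB=32
L3_PER_TASK_MB=10

# l3_check THREADS_PER_TASK (hypothetical helper)
# Prints how many tasks land on one CCX and whether their L3 need fits.
l3_check() {
  threads=$1
  # A task fits inside one CCX only if its thread count divides the cores.
  if [ "$threads" -gt "$CORES_PER_CCX" ] || [ $(( CORES_PER_CCX % threads )) -ne 0 ]; then
    echo "$threads threads/task: a task straddles both CCX"
    return
  fi
  tasks=$(( CORES_PER_CCX / threads ))
  l3=$(( tasks * L3_PER_TASK_MB ))
  if [ "$l3" -le "$L3_PER_CCX_MB" ]; then verdict="fits"; else verdict="too much L3"; fi
  echo "$threads threads/task: $tasks tasks/CCX, $l3 MB L3, $verdict"
}
```

Calling l3_check for 2, 3 and 6 threads per task reports a fit, while 1 thread per task exceeds the 32 MB and 4 or 12 threads per task are flagged as straddling both CCX.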

pin-genefer-threads

The benefit of thread pinning (setting threads' affinity to cores) is a 20% gain in CPU time (without any overheating!), obtained by proper use of the L1, L2 and L3 caches.

A last slight change to pin-genefer-threads on line 4: grep genefer_linux64 now selects only the pids of CPU tasks.
genefer_linux64_3.4.0-3 spawns one unused thread whose spid is pid+1; its affinity is set to the same processor as pid.

$ cat /var/lib/boinc/pin-genefer-threads

#!/bin/bash

processor=-1
pids=`ps -u boinc | grep genefer_linux64 | awk '{print $1}'`
for pid in $pids; do
  spids=`ps -Tu boinc | grep $pid | awk '{print $2}'`
  for spid in $spids; do
    if [[ $(($spid-$pid)) -ne 1 ]]; then
      processor=$(($processor+1))
    fi
    taskget=`taskset -pc $spid | awk '{print $6}'`
    if [[ $taskget -ne $processor ]]; then
      echo 'boinc genefer process' $pid ': taskset -pc' $processor $spid
      taskset -pc $processor $spid
    fi
  done
done
# That's all, folks !!!

For Ubuntu 22.04 LTS, pin-genefer-threads is executed every minute and reports in the journal.

$ cat /etc/crontab
# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.

SHELL=/bin/sh
# You can also override PATH, but by default, newer versions inherit it from the environment
#PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed
17 * * * * root cd / && run-parts --report /etc/cron.hourly
25 6 * * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
47 6 * * 7 root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
52 6 1 * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )
*/1 * * * * root /var/lib/boinc/pin-genefer-threads | systemd-cat
#

libsleep

The YIELD_SLEEP_TIME environment variable can be used
to tune the sleep time (in microseconds) during which boinc does
not attend to the GPU task, so that no time is wasted on one core.

Without libsleep, the GPU task would disturb the perfect scalability of the
parallelization implemented by Yves Gallot in the new multithreaded genefer.

$ cat /usr/lib/systemd/system/boinc-client.service
[Unit]
Description=Berkeley Open Infrastructure Network Computing Client
Documentation=man:boinc(1)
After=network-online.target

[Service]
Type=simple
ProtectHome=true
ProtectSystem=strict
ProtectControlGroups=true
ReadWritePaths=-/var/lib/boinc -/etc/boinc-client
Nice=10
User=boinc
WorkingDirectory=/var/lib/boinc
# Standard ExecStart:
#ExecStart=/usr/bin/boinc
# Verbose ExecStart with libsleep
#ExecStart=/bin/sh -c 'YIELD_SLEEP_TIME="500" LD_PRELOAD="/var/lib/boinc/libsleep.so" /usr/bin/boinc --dir /var/lib/boinc-client >/var/log/boinc.log 2>/var/log/boincerr.log'
# ExecStart with libsleep without preset YIELD_SLEEP_TIME; the default in the code is 1000 usec (1 ms).
ExecStart=/bin/sh -c 'LD_PRELOAD="/var/lib/boinc/libsleep.so" /usr/bin/boinc'
...

No change to libsleep.c.

$ cat /var/lib/boinc/libsleep.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * To compile run:
 * gcc -O2 -fPIC -shared -Wl,-soname,libsleep.so -o libsleep.so libsleep.c
 *
 * To use:
 * LD_PRELOAD="./libsleep.so" ./cgminer
 *
 * You can configure sleep time by setting
 * YIELD_SLEEP_TIME environment variable (in microseconds)
 * Default is 1000usec
 * Example:
 * YIELD_SLEEP_TIME="1500" LD_PRELOAD="./libsleep.so" ./cgminer
 *
 * Tips are welcome:
 * Bitcoin: 1FQMFpqnCH1ATPGoHLnmSAXB1gBh9vAKXC
 * Litecoin: LZiXcRvUr5wJrkfvc9vXvqDwe51YfVPRkv
 *
 */

useconds_t yield_sleep_time = 1000;

static void __attribute__ ((constructor)) lib_init(void) {
    int stime = 0;
    char *env_stime = getenv("YIELD_SLEEP_TIME");
    if(env_stime) {
        stime = atoi(env_stime);
        if(stime > 0)
            yield_sleep_time = stime;
    }
    printf("libsleep: Sleep time: %uusec\n", yield_sleep_time);
}

int sched_yield(void) {
    usleep(yield_sleep_time);
    return 0;
}

That's all, folks !!!

pascaltec
Message 155514 - Posted: 21 May 2022 | 8:54:38 UTC
Last modified: 21 May 2022 | 9:25:33 UTC

Occupancy, Topology and Scalability as a function of the parameter "Threads per task".
pin-genefer-threads and libsleep enabled, precision boost overdrive negative offset 20.
Parameter = "Multi-threading: Max # of threads for each task" == Threads per task.

Benchmark of Genefer 16 3.25 (genefer_linux64_3.4.0-3)

Occupancy/Topology ____________SMT disabled____________
Threads per task : 1     2     3     4     5     6
Number of tasks  : 12    6     4     3     2     2
Number of threads: 12    12    12    12    10    12
Free unused cores:                         2
Threads CCD #1|#2: 6|6   6|6   6|6   6|6   6|4   6|6
Tasks   CCD #1|#2: 6|6   3|3   2|2   2|2   2|1   1|1
Use per L3 cache : ?     ?     ?     ?     ?     ?
Optimal use of L3: yes   yes   yes   NO    NO    yes
Temperatures (°C):
Tctl  (°C)       : 75°   76°   76°   79°   82°   77°
Tccd1 (°C)       : 74°   73°   75°   78°   80°   75°
Tccd2 (°C)       : 70°   71°   69°   73°   77°   71°
Frequency (GHz)  : 4.0   4.0   4.1   4.2   4.4   4.2
Avg. Run time (s): 534   318   224   179   150   121
Min. Run time (s): 529   312   217   161   131   118
Max. Run time (s): 540   324   230   209   163   128
Avg. CPU time (s): 527   627   660   703   736   713
Min. CPU time (s): 525   618   640   633   646   697
Max. CPU time (s): 528   642   678   821   803   749
Scal. Avg. Run(%): 1     84%   80%   74%   71%   73%
Scal. Max. Run(%): 1     83%   78%   65%   66%   70%
Scal. Avg. CPU(%): 1     84%   80%   75%   71%   74%
Scal. Max. CPU(%): 1     82%   78%   64%   66%   70%
Comments         :                   (a)   (b)   (c)

(a) one of the tasks uses both CCDs and their L3 caches, therefore this task is significantly slower.
(b) two cores are free, Precision Boost Overdrive raises the frequency (10%) and overheating is observed.
(c) the run time (two minutes) is too short for pin-genefer-threads, which runs every minute, to apply efficiently.

Notes:
Frequency: lscpu -e
Temperature: sensors (lm-sensors)
Topology: lstopo --ps --top --of console (hwloc)
CPU time (sum of all participating cores) and Run Time as reported on the "Your account" primegrid's home page.
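The "Scal. Avg. Run" row appears to be the usual parallel efficiency, T1 / (N x TN), where T1 is the single-thread run time and TN the run time with N threads per task. That formula is an assumption on my part, as the post does not state it; the sketch below (hypothetical helper name) reproduces the table to within one percentage point.

```shell
# efficiency T1 N TN (hypothetical helper)
# Parallel efficiency in percent, rounded to the nearest integer,
# assuming efficiency = T1 / (N * TN). Integer arithmetic only.
efficiency() {
  denom=$(( $2 * $3 ))
  echo $(( (100 * $1 + denom / 2) / denom ))
}
```

With the average run times above, efficiency 534 2 318 prints 84 and efficiency 534 5 150 prints 71, matching the table; some other columns differ by one point, presumably because of rounding in the original figures.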

pascaltec
Message 155517 - Posted: 21 May 2022 | 12:09:44 UTC
Last modified: 21 May 2022 | 12:12:22 UTC

Occupancy, Topology, Scalability, functions of the parameter "Threads per task"
of the PrimeGrid preferences "Multi-threading: Max # of threads for each task".


Genefer 19 3.25 (genefer_linux64_3.4.0-3)

Occupancy/Topology _______________________SMT disabled______________________
Threads per task : 1       2       3       4       5       6       12
Number of tasks  : 12      6       4       3       2       2       1
Number of threads: 12      12      12      12      10      12      12
Free unused cores:                                 2
Threads CCD #1|#2: 6|6     6|6     6|6     6|6     6|4     6|6     6|6
Tasks   CCD #1|#2: 6|6     3|3     2|2     2|2     2|1     1|1     1|1
L3 cache use (MB): 60      30      20      20      20      10      10
L3 cache use OK? : NO!     yes     yes     NO!     NO!     yes     NO!
Avg. Run time (s):         20,880  13,355                  6,900   6,633
Avg. CPU time (s):         41,760  39,862                  41,400  79,465
Scalability (%)  :         96%     100%                    96%     50%

Notes: Run (Elapsed) time and CPU time of "your account" home page.
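Only four configurations have timings in this table. Which thread counts they belong to can be cross-checked from the numbers themselves, since CPU time is the sum over all participating cores: CPU time divided by run time should be close to the number of threads per task. A minimal sketch, with a hypothetical helper name:

```shell
# threads_used CPU_SECONDS RUN_SECONDS (hypothetical helper)
# Nearest integer to CPU time / run time, i.e. the apparent number of
# busy threads per task. Times are given without thousands separators.
threads_used() {
  echo $(( ($1 + $2 / 2) / $2 ))
}
```

The four pairs of timings give 2, 3, 6 and 12 threads respectively, for example threads_used 39862 13355 prints 3.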

Profile composite
Volunteer tester
Joined: 16 Feb 10
Posts: 1022
ID: 55391
Credit: 888,278,303
RAC: 136,448
Discovered 2 mega primesFound 1 prime in the 2018 Tour de PrimesFound 1 prime in the 2022 Tour de Primes321 LLR Turquoise: Earned 5,000,000 credits (6,055,323)Cullen LLR Gold: Earned 500,000 credits (776,297)ESP LLR Ruby: Earned 2,000,000 credits (3,433,680)Generalized Cullen/Woodall LLR Ruby: Earned 2,000,000 credits (2,443,837)PPS LLR Sapphire: Earned 20,000,000 credits (34,610,067)PSP LLR Turquoise: Earned 5,000,000 credits (6,587,988)SoB LLR Sapphire: Earned 20,000,000 credits (45,081,394)SR5 LLR Turquoise: Earned 5,000,000 credits (6,205,694)SGS LLR Ruby: Earned 2,000,000 credits (3,627,819)TRP LLR Turquoise: Earned 5,000,000 credits (7,078,152)Woodall LLR Amethyst: Earned 1,000,000 credits (1,693,614)321 Sieve (suspended) Emerald: Earned 50,000,000 credits (50,256,050)Cullen/Woodall Sieve (suspended) Turquoise: Earned 5,000,000 credits (5,571,178)Generalized Cullen/Woodall Sieve (suspended) Emerald: Earned 50,000,000 credits (50,009,610)PPS Sieve Double Silver: Earned 200,000,000 credits (463,452,443)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Jade: Earned 10,000,000 credits (10,165,888)TRP Sieve (suspended) Sapphire: Earned 20,000,000 credits (20,071,454)AP 26/27 Turquoise: Earned 5,000,000 credits (6,798,063)GFN Emerald: Earned 50,000,000 credits (57,113,430)WW Ruby: Earned 2,000,000 credits (4,484,000)PSA Double Bronze: Earned 100,000,000 credits (102,762,384)
Message 155518 - Posted: 21 May 2022 | 12:14:14 UTC
Last modified: 21 May 2022 | 12:14:48 UTC

My experience with systemd files is limited, but I think this will work.
Use Environment lines instead of putting them on the ExecStart line.
Then you would not need to invoke a shell in ExecStart.
Multiple Environment lines are allowed.

[Service]
Environment="YIELD_SLEEP_TIME=1000"
Environment="LD_PRELOAD=/var/lib/boinc/libsleep.so"
ExecStart=/usr/bin/boinc
...

Rather than having a default in the source code, or in addition to the source code,
the default is contained in the systemd file as above. Have the source code error out
if the environment variable is not set or has an unreasonable value.

The user would use an override file with the command "systemctl edit boinc-client"
to set a different sleep time. The override file would contain

[Service]
Environment="YIELD_SLEEP_TIME=5000"
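The override file and the "error out on an unreasonable value" idea can be combined in a small script. This is a sketch under stated assumptions: write_override is a hypothetical helper, and the accepted range of 1 to 1000000 microseconds is an arbitrary bound chosen for illustration.

```shell
# write_override DIR USEC (hypothetical helper)
# Validates USEC and writes DIR/override.conf with the Environment line.
# The 1..1000000 usec range is an arbitrary assumption for this sketch.
write_override() {
  dir=$1
  usec=$2
  case "$usec" in
    ''|*[!0-9]*)
      echo "YIELD_SLEEP_TIME must be a positive integer" >&2
      return 1 ;;
  esac
  if [ "$usec" -lt 1 ] || [ "$usec" -gt 1000000 ]; then
    echo "YIELD_SLEEP_TIME out of range: $usec" >&2
    return 1
  fi
  mkdir -p "$dir" || return 1
  printf '[Service]\nEnvironment="YIELD_SLEEP_TIME=%s"\n' "$usec" > "$dir/override.conf"
}
```

The file that "systemctl edit boinc-client" creates lives in /etc/systemd/system/boinc-client.service.d/override.conf, so calling write_override with that directory as root, followed by systemctl daemon-reload and systemctl restart boinc-client, should have the same effect as editing it by hand.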

pascaltec
Message 155523 - Posted: 21 May 2022 | 16:24:05 UTC
Last modified: 21 May 2022 | 16:55:15 UTC

@composite: Indeed, that would be cleaner and more practical for changing YIELD_SLEEP_TIME dynamically when testing.

Occupancy, Topology, Scalability as functions of "Threads per task" =
PrimeGrid preferences "Multi-threading: Max # of threads for each task".

The only setup that meets the criteria is 4 tasks of 6 threads (in the specific case of genefer GFN-19 with SMT enabled).


Genefer 19 3.25 (genefer_linux64_3.4.0-3)

Occupancy/Topology _______________________SMT disabled______________________
Threads per task : 1       2       3       4       5       6       12
Number of tasks  : 12      6       4       3       2       2       1
Number of threads: 12      12      12      12      10      12      12
Free unused cores:                                 2
Threads CCD #1|#2: 6|6     6|6     6|6     6|6     6|4     6|6     6|6
Tasks   CCD #1|#2: 6|6     3|3     2|2     2|2     2|1     1|1     1|1
L3 cache use (MB): 60      30      20      20      20      10      10
L3 cache use OK? : NO!     yes     yes     NO!     NO!     yes     NO!
Avg. Run time (s):         20,880  13,355                  6,900   6,633
Avg. CPU time (s):         41,760  39,862                  41,400  79,465
Scalability (%)  :         96%     100%                    96%     50%

Occupancy/Topology      _______________________SMT enabled_______________________
Threads per task      : 1       2       3       4       5       6       12
Number of tasks       : 24      12      8       6       4       4       2
Number of threads     : 24      24      24      24      20      24      24
Free unused cores     :                                 4
Tasks CCD #1|#2       : 12|12   6|6     4|4     3|3     3|2     2|2     2|2
Tasks: Even PU #1|#2  : 6|6     3|3     2|2     3|3     3|2     T1|T3   2|2
       Odd PU #1|#2   : 6|6     3|3     2|2     3|3     3|2     T2|T4   2|2
Threads CCD #1|#2     : 12|12   12|12   12|12   12|12   12|8    12|12   12|12
Threads: Even PU #1|#2: 6|6     6|6     6|6     6|6     6|4     6|6     6|6
         Odd PU #1|#2 : 6|6     6|6     6|6     6|6     6|4     6|6     6|6
L3 cache use (MB)     : 120     60      40      30      30      20      20
cache use < 32MB      : NO!     NO!     NO!     yes     yes     yes     yes
cache not shared      : yes     yes     yes     NO!     NO!     yes     NO!
L3 cache use OK?      : NO!     NO!     NO!     NO!     NO!     YES     NO!
Avg. Run time (s)     : ?       ?       ?       ?       ?       13,800  7,832
Avg. CPU time (s)     : ?       ?       ?       ?       ?       55,200  93,710
Scalability (%)       : ?       ?       ?       ?       ?       100%    59%

PU: When Simultaneous MultiThreading (SMT) is enabled,
1 (physical) core is divided into two Processing Units:
one is even-labelled, the second is odd-labelled.

2 PUs = 2 instruction pipes in the same physical core
= 2 "logical" cores (to hide "memory" latencies).

Notes: Run (Elapsed) time and CPU time of "your account" home page.
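In these GFN-19 tables the "Scalability" row looks like a ratio of CPU times, with the lowest average CPU time of the series as the 100% reference. This reading is an assumption (the post gives no formula); the sketch below uses a hypothetical helper name and agrees with the table to within one percentage point.

```shell
# relative_scalability BEST_CPU CPU (hypothetical helper)
# BEST_CPU / CPU in percent, rounded to the nearest integer, where
# BEST_CPU is the lowest average CPU time of the series (assumed reference).
relative_scalability() {
  echo $(( (100 * $1 + $2 / 2) / $2 ))
}
```

For the SMT-enabled series, relative_scalability 55200 93710 prints 59; for the SMT-disabled series, relative_scalability 39862 79465 prints 50 and relative_scalability 39862 41400 prints 96.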

Profile composite
Message 155525 - Posted: 21 May 2022 | 21:14:38 UTC - in response to Message 155523.
Last modified: 21 May 2022 | 21:15:25 UTC


PU: When Simultaneous MultiThreading (SMT) is enabled,
1 (physical) core is divided in two Processing Units
one is even labelled, the second is odd labelled.

2 PUs = 2 instructions pipes in the same physical core
= 2 "logical" cores (to hide "memory" latencies).

I'm not picturing your AMD CPU structure clearly.
Can you repost with the "--no-io --of ascii" options and paste the output into a code block?
For example, this Intel chip has
$ lstopo-no-graphics --no-io --of ascii
┌───────────────────────────────────────────────────────────────────┐
│ Machine (15GB total)                                              │
│                                                                   │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ Package L#0                                                   │ │
│ │                                                               │ │
│ │ ┌───────────────────────────────────────────────────────────┐ │ │
│ │ │ NUMANode L#0 P#0 (15GB)                                   │ │ │
│ │ └───────────────────────────────────────────────────────────┘ │ │
│ │                                                               │ │
│ │ ┌───────────────────────────────────────────────────────────┐ │ │
│ │ │ L3 (15MB)                                                 │ │ │
│ │ └───────────────────────────────────────────────────────────┘ │ │
│ │                                                               │ │
│ │ ┌─────────────┐  ┌─────────────┐              ┌─────────────┐ │ │
│ │ │ L2 (256KB)  │  │ L2 (256KB)  │  ├┤ ├┤ ├┤    │ L2 (256KB)  │ │ │
│ │ └─────────────┘  └─────────────┘              └─────────────┘ │ │
│ │                                   6x total                    │ │
│ │ ┌─────────────┐  ┌─────────────┐              ┌─────────────┐ │ │
│ │ │ L1d (32KB)  │  │ L1d (32KB)  │              │ L1d (32KB)  │ │ │
│ │ └─────────────┘  └─────────────┘              └─────────────┘ │ │
│ │                                                               │ │
│ │ ┌─────────────┐  ┌─────────────┐              ┌─────────────┐ │ │
│ │ │ L1i (32KB)  │  │ L1i (32KB)  │              │ L1i (32KB)  │ │ │
│ │ └─────────────┘  └─────────────┘              └─────────────┘ │ │
│ │                                                               │ │
│ │ ┌─────────────┐  ┌─────────────┐              ┌─────────────┐ │ │
│ │ │ Core L#0    │  │ Core L#1    │              │ Core L#5    │ │ │
│ │ │             │  │             │              │             │ │ │
│ │ │ ┌─────────┐ │  │ ┌─────────┐ │              │ ┌─────────┐ │ │ │
│ │ │ │ PU L#0  │ │  │ │ PU L#2  │ │              │ │ PU L#10 │ │ │ │
│ │ │ │         │ │  │ │         │ │              │ │         │ │ │ │
│ │ │ │ P#0     │ │  │ │ P#1     │ │              │ │ P#5     │ │ │ │
│ │ │ └─────────┘ │  │ └─────────┘ │              │ └─────────┘ │ │ │
│ │ │ ┌─────────┐ │  │ ┌─────────┐ │              │ ┌─────────┐ │ │ │
│ │ │ │ PU L#1  │ │  │ │ PU L#3  │ │              │ │ PU L#11 │ │ │ │
│ │ │ │         │ │  │ │         │ │              │ │         │ │ │ │
│ │ │ │ P#6     │ │  │ │ P#7     │ │              │ │ P#11    │ │ │ │
│ │ │ └─────────┘ │  │ └─────────┘ │              │ └─────────┘ │ │ │
│ │ └─────────────┘  └─────────────┘              └─────────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────┘
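In the diagram above, the two PUs of a core have consecutive logical labels (L#0 and L#1) but OS indices (P#) that are one core count apart: P#0 with P#6, P#1 with P#7, P#5 with P#11. Since taskset works with the OS indices, it can help to compute the sibling explicitly. The following is a sketch with a hypothetical helper name, valid only for machines numbered this way.

```shell
# smt_sibling CPU NCORES (hypothetical helper)
# OS index of the other PU on the same physical core, assuming the
# numbering shown in the diagram: siblings are NCORES apart.
smt_sibling() {
  echo $(( ($1 + $2) % (2 * $2) ))
}
```

On this 6-core chip smt_sibling 0 6 prints 6 and smt_sibling 7 6 prints 1. Rather than assuming the numbering, the kernel's own answer can be read from /sys/devices/system/cpu/cpu0/topology/thread_siblings_list.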

pascaltec
Message 155529 - Posted: 22 May 2022 | 3:02:26 UTC
Last modified: 22 May 2022 | 3:19:19 UTC

@composite: Very good idea!
The next 4 diagrams show either the topology and its use, or the topology alone, with SMT on or off.
Of course, not all cases of the benchmark can be reported.

Ryzen 9 5900X, SMT enabled, pin-genefer-threads enabled, 4 GFN-19 tasks with 6 threads each.

$ lstopo --ps --top --of ascii

┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ Machine (16GB total) │ │ │ │ ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ ├┤╶─┬─────┼┤╶─┬─────┬─────────────┐ │ │ │ Package L#0 │ │3,9 │3,9 │ PCI 01:00.1 │ │ │ │ │ │ │ └─────────────┘ │ │ │ ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ │ │ │ │ │ │ NUMANode L#0 P#0 (16GB) │ │ │ └─────┼┤╶─┬─────┼┤╶───────┬───────────────────┐ │ │ │ └────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ │ │ 3,9 │3,9 3,9 │ PCI 04:00.0 │ │ │ │ │ │ │ │ │ │ │ │ ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ │ │ │ ┌───────────────┐ │ │ │ │ │ L3 (32MB) │ │ │ │ │ │ Block nvme0n1 │ │ │ │ │ └────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ 465 GB │ │ │ │ │ ┌──────────────────────────────────────────────────────────────────┐ ┌───────────────────────────────────┐ ┌───────────────────────────────────┐ ┌──────────────────────┐ ┌──────────────────────┐ │ │ │ │ └───────────────┘ │ │ │ │ 
│ L2 (512KB) │ │ L2 (512KB) │ ├┤ ├┤ ├┤ │ L2 (512KB) │ │ 5070 genefer_linux64 │ │ 5074 genefer_linux64 │ │ │ │ └───────────────────┘ │ │ │ └──────────────────────────────────────────────────────────────────┘ └───────────────────────────────────┘ └───────────────────────────────────┘ └──────────────────────┘ └──────────────────────┘ │ │ │ │ │ │ 6x total │ │ └─────┼┤╶───────┬────────────────┐ │ │ │ ┌──────────────────────────────────────────────────────────────────┐ ┌───────────────────────────────────┐ ┌───────────────────────────────────┐ │ │ 0,6 0,6 │ PCI 06:00.0 │ │ │ │ │ L1d (32KB) │ │ L1d (32KB) │ │ L1d (32KB) │ │ │ │ │ │ │ │ └──────────────────────────────────────────────────────────────────┘ └───────────────────────────────────┘ └───────────────────────────────────┘ │ │ │ ┌────────────┐ │ │ │ │ │ │ │ │ Net enp6s0 │ │ │ │ │ ┌──────────────────────────────────────────────────────────────────┐ ┌───────────────────────────────────┐ ┌───────────────────────────────────┐ │ │ │ └────────────┘ │ │ │ │ │ L1i (32KB) │ │ L1i (32KB) │ │ L1i (32KB) │ │ │ └────────────────┘ │ │ │ └──────────────────────────────────────────────────────────────────┘ └───────────────────────────────────┘ └───────────────────────────────────┘ │ │ │ │ │ │ └─────┼┤╶───────┬──────────────────────┐ │ │ │ ┌──────────────────────────────────────────────────────────────────┐ ┌───────────────────────────────────┐ ┌───────────────────────────────────┐ │ 4,0 4,0 │ PCI 07:00.0 │ │ │ │ │ Core L#0 │ │ Core L#1 │ │ Core L#5 │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ ┌──────────┐ │ │ │ │ │ ┌──────────────────────────────────────────────────────────────┐ │ │ ┌───────────────────────────────┐ │ │ ┌───────────────────────────────┐ │ │ │ │ GPU :1.0 │ │ │ │ │ │ │ PU L#0 │ │ │ │ PU L#2 │ │ │ │ PU L#10 │ │ │ │ └──────────┘ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ P#0 │ │ │ │ P#1 │ │ │ │ P#5 │ │ │ │ ┌──────────────────┐ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ CoProc opencl0d0 │ │ │ │ │ │ │ ┌───────────────────────────┐ 
[lstopo diagram, continued and summarized]
    First L3 (32MB), continued:
      task 5070 genefer_linux64 (threads 5070, 5071, 5086, 5090 shown)
      PU L#1 (P#12), PU L#3 (P#13), PU L#11 (P#17): task 5074 genefer_linux64 (threads 5074, 5075, 5081, 5085 shown)
    Second L3 (32MB), 6x total cores, each with L2 (512KB), L1d (32KB), L1i (32KB):
      Core L#6, Core L#7, Core L#11
      PU L#12 (P#6), PU L#14 (P#7), PU L#22 (P#11): task 5072 genefer_linux64 (threads 5072, 5073, 5091, 5095 shown)
      PU L#13 (P#18), PU L#15 (P#19), PU L#23 (P#23): task 5076 genefer_linux64 (threads 5076, 5077, 5096, 5100 shown)
    CoProc: 8 compute units, 4036 MB
Host: valtin
Date: Sun. 22 May 2022 05:00:28

pascaltec
Message 155530 - Posted: 22 May 2022 | 3:05:05 UTC
Last modified: 22 May 2022 | 3:05:20 UTC

Ryzen 9 5900x SMT enabled

$ lstopo-no-graphics --no-io --of ascii

[lstopo diagram, summarized]
Machine (16GB total)
  Package L#0
    NUMANode L#0 P#0 (16GB)
    L3 (32MB), 6x total cores, each with L2 (512KB), L1d (32KB), L1i (32KB)
      Core L#0: PU L#0 (P#0), PU L#1 (P#12)
      Core L#1: PU L#2 (P#1), PU L#3 (P#13)
      (Core L#2 to Core L#4 elided in the diagram)
      Core L#5: PU L#10 (P#5), PU L#11 (P#17)
    L3 (32MB), 6x total cores, each with L2 (512KB), L1d (32KB), L1i (32KB)
      Core L#6: PU L#12 (P#6), PU L#13 (P#18)
      Core L#7: PU L#14 (P#7), PU L#15 (P#19)
      (Core L#8 to Core L#10 elided in the diagram)
      Core L#11: PU L#22 (P#11), PU L#23 (P#23)
Host: valtin
Date: Sun. 22 May 2022 05:03:20

pascaltec
Message 155531 - Posted: 22 May 2022 | 3:08:31 UTC

Ryzen 9 5900x SMT disabled
$ lstopo-no-graphics --no-io --of ascii

[lstopo diagram, summarized]
Machine (16GB total)
  Package L#0
    NUMANode L#0 P#0 (16GB)
    L3 (32MB), 6x total cores, each with L2 (512KB), L1d (32KB), L1i (32KB)
      Core L#0: PU L#0 (P#0)
      Core L#1: PU L#1 (P#1)
      (Core L#2 to Core L#4 elided in the diagram)
      Core L#5: PU L#5 (P#5)
    L3 (32MB), 6x total cores, each with L2 (512KB), L1d (32KB), L1i (32KB)
      Core L#6: PU L#6 (P#6)
      Core L#7: PU L#7 (P#7)
      (Core L#8 to Core L#10 elided in the diagram)
      Core L#11: PU L#11 (P#11)
Host: valtin
Date: Sun. 22 May 2022 05:06:57

pascaltec
Message 155532 - Posted: 22 May 2022 | 3:11:19 UTC
Last modified: 22 May 2022 | 3:19:37 UTC

Ryzen 9 5900X, SMT disabled, 2 GFN-19 tasks of 6 threads each, pinned with pin-genefer-threads

$ lstopo --ps --top --of ascii

[lstopo diagram, summarized]
Machine (16GB total)
  Package L#0
    NUMANode L#0 P#0 (16GB)
    L3 (32MB), 6x total cores, each with L2 (512KB), L1d (32KB), L1i (32KB)
      Core L#0: PU L#0 (P#0); Core L#1: PU L#1 (P#1); Core L#5: PU L#5 (P#5)
      task 1645 genefer_linux64 (threads 1645, 1646, 1660, 1664 shown)
    L3 (32MB), 6x total cores, each with L2 (512KB), L1d (32KB), L1i (32KB)
      Core L#6: PU L#6 (P#6); Core L#7: PU L#7 (P#7); Core L#11: PU L#11 (P#11)
      task 1653 genefer_linux64 (threads 1653, 1654, 1668, 1672 shown)
  PCI 01:00.1
  PCI 04:00.0: Block nvme0n1 (465 GB)
  PCI 06:00.0: Net enp6s0
  PCI 07:00.0: GPU :1.0, CoProc opencl0d0 (8 compute units, 4036 MB)
Host: valtin
Date: Sun. 22 May 2022 05:09:06

pascaltec
Message 155533 - Posted: 22 May 2022 | 4:01:16 UTC - in response to Message 155525.
Last modified: 22 May 2022 | 4:27:19 UTC

pin-genefer-threads is useful for
- processors with multiple dies (Ryzen 9) or hybrid processors with Performance and Efficient cores (Intel 12900K, 12700K),
- AMD Epyc / Intel Xeon platforms with multiple physical processors (sockets) and Non-Uniform Memory Access (NUMA).
Why?
- A task and all of its threads should run on cores that share the same L3 cache.
- Each thread of the task is pinned to one core and keeps the same L1d, L1i and L2 caches
(all the cores of a task sharing the same L3 cache),
so the core always accesses the same, fastest "memory" (caches and DDR) (20% gain).
Each thread also uses the same AVX2 unit for the whole duration of the task (no gain).
- The memory required by the tasks sharing an L3 cache should not exceed its capacity,
as pointed out by Yves Gallot (1 GFN-19 task: 10 MB; Ryzen 9 L3 cache: 32 MB per CCD; Ryzen 7 5800X3D: 96 MB).
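
The rules in the list above can be sketched as a small planner. This is only an illustration, not the pin-genefer-threads script itself: it assumes the SMT-disabled numbering shown by lstopo (CPUs 0 to 5 on the first L3, 6 to 11 on the second), assumes tasks are packed onto consecutive cores, and uses the 10 MB per GFN-19 task and 32 MB per L3 figures quoted in this thread. The names `plan` and `pin_thread` are made up for the example.

```python
import os
from dataclasses import dataclass

# Figures taken from the thread; the packing strategy is an assumption.
CORES_PER_CCD = 6        # Ryzen 9 5900X, SMT disabled: 2 CCDs x 6 cores
NUM_CCDS = 2
L3_PER_CCD_MB = 32       # L3 size shown by lstopo
TASK_FOOTPRINT_MB = 10   # one GFN-19 task, as quoted from Yves Gallot


@dataclass
class Plan:
    tasks: int
    free_cores: int
    cpu_sets: list        # one list of CPU numbers per task
    tasks_per_ccd: list   # how many tasks touch each CCD
    l3_use_mb: int        # worst-case L3 demand on a single CCD
    ok: bool              # no task straddles CCDs and the L3 demand fits


def plan(threads_per_task):
    """Pack tasks onto consecutive cores and check the two L3 rules."""
    total = CORES_PER_CCD * NUM_CCDS
    tasks = total // threads_per_task
    cpu_sets = [list(range(t * threads_per_task, (t + 1) * threads_per_task))
                for t in range(tasks)]
    touching = [0] * NUM_CCDS
    straddles = False
    for cpus in cpu_sets:
        ccds = {cpu // CORES_PER_CCD for cpu in cpus}
        straddles = straddles or len(ccds) > 1
        for ccd in ccds:
            touching[ccd] += 1
    l3_use = max(touching) * TASK_FOOTPRINT_MB
    ok = not straddles and l3_use <= L3_PER_CCD_MB
    return Plan(tasks, total - tasks * threads_per_task, cpu_sets,
                touching, l3_use, ok)


def pin_thread(tid, cpu):
    """Pin one thread (Linux thread ID) to one CPU. Linux only."""
    os.sched_setaffinity(tid, {cpu})


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 6, 12):
        p = plan(n)
        print(n, p.tasks, p.free_cores, p.tasks_per_ccd, p.l3_use_mb, p.ok)
```

With these assumptions only 2, 3 and 6 threads per task pass both checks: 1 thread per task puts six 10 MB tasks on a 32 MB L3, while 4, 5 and 12 threads per task make at least one task straddle the two CCDs. On Linux, the thread IDs to pass to `pin_thread` could be read from /proc/<pid>/task.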
Intel Alder Lake
pin-genefer-threads has NOT been tested on the 12900K, 12700K or 12600K and probably requires changes
to account for the topology of the (Golden Cove + Gracemont) cores: (8+8), (8+4) and (6+4) respectively.
Windows
A similar approach can probably be implemented under Windows, using the equivalent of taskset
to tell the Windows scheduler how to distribute the tasks and how to pin the genefer threads.
LLR
Just replace "genefer_linux" in the grep command with the name of the LLR application (found in /var/lib/boinc/projects/www.primegrid.com).

pascaltec
Message 155537 - Posted: 22 May 2022 | 5:18:53 UTC
Last modified: 22 May 2022 | 5:23:29 UTC

@composite:

Use Environment lines instead of putting them on the ExecStart line.
Then you would not need to invoke a shell in ExecStart.
Multiple Environment lines are allowed.

[Service]
Environment="YIELD_SLEEP_TIME=1000"
Environment="LD_PRELOAD=/var/lib/boinc/libsleep.so"
ExecStart=/usr/bin/boinc
...

Rather than having a default in the source code, or in addition to the source code,
the default is contained in the systemd file as above. Have the source code error out
if the environment variable is not set or has an unreasonable value.

The user would use an override file with the command "systemctl edit boinc-client"
to set a different sleep time. The override file would contain

[Service]
Environment="YIELD_SLEEP_TIME=5000"
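
The "error out if the environment variable is not set or has an unreasonable value" check suggested above could look like the following sketch. It only illustrates the logic and is not libsleep's code (a preloaded library would do this in its own language). The bounds `MIN_SLEEP` and `MAX_SLEEP` and the function name are hypothetical, since the thread does not say which values count as unreasonable.

```python
import os

# Hypothetical bounds: the thread does not define "unreasonable".
MIN_SLEEP = 1
MAX_SLEEP = 1_000_000


def read_yield_sleep_time(env=None):
    """Return YIELD_SLEEP_TIME as an int, or raise ValueError."""
    env = os.environ if env is None else env
    raw = env.get("YIELD_SLEEP_TIME")
    if raw is None:
        raise ValueError("YIELD_SLEEP_TIME is not set")
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"YIELD_SLEEP_TIME is not an integer: {raw!r}") from None
    if not MIN_SLEEP <= value <= MAX_SLEEP:
        raise ValueError(f"YIELD_SLEEP_TIME out of range: {value}")
    return value
```

Because the value is then always supplied by the unit file or its override, no default has to be compiled into the library.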

Setting up libsleep (update)
The first option is to use an override file, as proposed by composite: "systemctl edit boinc-client".
The override file leaves the original service file untouched.
OR
The second option is to modify the original service file:

$ sudo vi /usr/lib/systemd/system/boinc-client.service
[Unit]
Description=Berkeley Open Infrastructure Network Computing Client
Documentation=man:boinc(1)
After=network-online.target

[Service]
Type=simple
ProtectHome=true
ProtectSystem=strict
ProtectControlGroups=true
ReadWritePaths=-/var/lib/boinc -/etc/boinc-client
Nice=10
User=boinc
WorkingDirectory=/var/lib/boinc
Environment="YIELD_SLEEP_TIME=1000"
Environment="LD_PRELOAD=/var/lib/boinc/libsleep.so"
ExecStart=/usr/bin/boinc
#ExecStart=/bin/sh -c 'YIELD_SLEEP_TIME="500" LD_PRELOAD="/var/lib/boinc/libsleep.so" /usr/bin/boinc --dir /var/lib/boinc-client >/var/log/boinc.log 2>/var/log/boincerr.log'
#ExecStart=/bin/sh -c 'LD_PRELOAD="/var/lib/boinc/libsleep.so" /usr/bin/boinc'
ExecStop=/usr/bin/boinccmd --quit
ExecReload=/usr/bin/boinccmd --read_cc_config
ExecStopPost=/bin/rm -f lockfile
...

In any case, do not forget to:
$ systemctl daemon-reload
$ systemctl restart boinc-client.service

Yves Gallot
Volunteer developer
Project scientist
Message 155545 - Posted: 22 May 2022 | 12:15:05 UTC - in response to Message 155533.

Intel Alder Lake
pin-genefer-threads has NOT been tested on 12900K, 12700K, 12600K and probably requires changes to account for the topology of the (Golden Cove + Gracemont) cores (8+8) (8+4) (6+4) respectively.

I don't think that it's needed. E cores are too slow for PrimeGrid apps: two 128-bit FP units for Gracemont vs three 256-bit FP units for Golden Cove.
Intel Thread Director is efficient and 8 threads run on the 8 P cores and E cores are not used.

Threads per task   :      1      2      3      4      5      6     12
Number of tasks    :     12      6      4      3      2      2      1
Number of threads  :     12     12     12     12     10     12     12
Free unused cores  :      -      -      -      -      2      -      -
Threads CCD #1|#2  :    6|6    6|6    6|6    6|6    6|4    6|6    6|6
Tasks CCD #1|#2    :    6|6    3|3    2|2    2|2    2|1    1|1    1|1
L3 cache use (MB)  :     60     30     20     20     20     10     10
L3 cache use OK?   :    NO!    yes    yes    NO!    NO!    yes    NO!
Avg. Run time (s)  :      - 20,880 13,355      -      -  6,900  6,633

If I convert this table to throughput, I find
Threads per task:     2     3     6    12
Number of tasks:      6     4     2     1
GFN-19/day:        24.8  25.8  25.0  13.0

The fastest configuration is 4 tasks x 3 threads. But 2 x 6 is about as fast and the likelihood of being first is greater.
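
The conversion can be checked with a few lines, assuming throughput is 86,400 s/day divided by the average run time, multiplied by the number of tasks running at the same time, and assuming the four measured run times belong to the 2, 3, 6 and 12 thread configurations. The function names are made up for the example.

```python
SECONDS_PER_DAY = 86_400

# threads per task -> (concurrent tasks, average run time in seconds),
# copied from the table quoted above.
MEASURED = {2: (6, 20_880), 3: (4, 13_355), 6: (2, 6_900), 12: (1, 6_633)}


def tasks_per_day(avg_run_time_s, concurrent_tasks):
    """Tasks finished per day when `concurrent_tasks` run side by side."""
    return SECONDS_PER_DAY / avg_run_time_s * concurrent_tasks


def report(measured=MEASURED):
    """Return (threads, tasks, run time, tasks/day) rows, sorted by threads."""
    rows = []
    for threads, (tasks, run_time) in sorted(measured.items()):
        rows.append((threads, tasks, run_time, tasks_per_day(run_time, tasks)))
    return rows


if __name__ == "__main__":
    for threads, tasks, run_time, per_day in report():
        hours = run_time / 3600
        print(f"{tasks} x {threads} threads: {per_day:.1f} GFN-19/day, "
              f"{hours:.1f} h per task")
```

The results agree with the figures above to within rounding. The run time per task is what separates the near-equal configurations: 6,900 s for 2 x 6 against 13,355 s for 4 x 3, which is why the faster return improves the chance of being first.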

pascaltec
Message 155554 - Posted: 22 May 2022 | 15:19:57 UTC - in response to Message 155545.
Last modified: 22 May 2022 | 15:30:58 UTC

If I convert this table to throughput, I find
Threads per task:     2     3     6    12
Number of tasks:      6     4     2     1
GFN-19/day:        24.8  25.8  25.0  13.0

The fastest configuration is 4 tasks x 3 threads. But 2 x 6 is about as fast and the likelihood of being first is greater.

I agree that the four percent difference in the presented tables is not significant.
As you probably observed, it is within the dispersion of genefer run times from one sample to the next.
The table is not built on enough samples to be statistically significant. The challenge will help!

Let's conclude that the following configurations provide the same throughput:
- with SMT disabled, 2 tasks of 6 threads, 4 tasks of 3 threads and 6 tasks of 2 threads
- with SMT enabled, 4 tasks of 6 threads

Without loss of throughput, the cruncher is free to choose (with SMT disabled) between
- long run time: 6 tasks of 2 threads,
- rapid return: 2 tasks of 6 threads,
- a balanced option: 4 tasks of 3 threads.
If SMT is enabled, the same throughput is achieved only with 4 tasks of 6 threads.
Notice that this takes the same wall-clock (run) time as the balanced option of 4 tasks of 3 threads!

As underlined by Yves, 2 tasks of 6 threads with SMT off gives
the minimum return time, hence maximizing the likelihood of being first!

Good luck to everyone for the Geek Pride Day Challenge!

pascaltec
Message 155567 - Posted: 23 May 2022 | 5:11:42 UTC
Last modified: 23 May 2022 | 5:50:06 UTC

Tasks returned in the last 24 hours

SETUP: PrimeGrid Preferences / Job Control and Multi-threading / Multi-threading: Max # of threads for each Task
RESULT: Throughput (Tasks/day) = 86,400 (s/day) / average Elapsed time (s) x Concurrent Tasks (number of Tasks running at the same time)

Date  Time  Sub-project Host   SMT Concurrent threads Tasks Firsts First%  Send/receive duration        Elapsed time                 Throughput   CPU time
                                   Tasks      /Task   Done                 average / minimum / maximum  average / minimum / maximum  Tasks/day    average / minimum / maximum
23/05 07:00 Genefer 19  valtin gpu 1          0       1     0      0.00%   26,918 / 26,918 / 26,918     15,153 / 15,153 / 15,153     5.7          267 / 267 / 267
23/05 07:00 Genefer 19  valtin off 2 Tasks    6 thrds 8     4      50.00%  9,530 / 6,868 / 14,399       7,072 / 6,814 / 7,380        24.4 2Tx6t   42,198 / 40,803 / 43,927
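
The throughput formula quoted above can be checked against the two rows of the table. This is a minimal sketch; the function name is hypothetical and the inputs are the average elapsed times reported in the post.

```python
SECONDS_PER_DAY = 86_400


def throughput(avg_elapsed_s: float, concurrent_tasks: int) -> float:
    """Tasks per day = 86,400 / average elapsed time (s) x concurrent tasks."""
    return SECONDS_PER_DAY / avg_elapsed_s * concurrent_tasks


print(round(throughput(15_153, 1), 1))  # GPU row: 5.7
print(round(throughput(7_072, 2), 1))   # 2 Tasks x 6 threads row: 24.4
```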

pascaltec
Message 155569 - Posted: 23 May 2022 | 6:22:31 UTC
Last modified: 23 May 2022 | 6:33:23 UTC

Reminder (as explained in the Geek Pride Day Challenge thread; everything is called a thread!):
If you don't know how to (or don't want to) disable Simultaneous Multithreading (SMT),
you can set the following two parameters in PrimeGrid Preferences / Job Control and Multi-threading:
- Multi-threading: Max # of threads for each task = a quarter of the total number of logical (hyperthreaded) cores = 6 for 24 logical cores on 12 physical cores
- Max # of simultaneous PrimeGrid tasks = half of the total number of logical (hyperthreaded) cores divided by the number of threads per task = 2 for 24 logical cores on 12 physical cores
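
The two rules above can be written as a small calculation. This is only a sketch: the function name is made up for illustration, and it assumes two logical cores per physical core (SMT enabled), as in the 24/12 example.

```python
from typing import Tuple


def primegrid_settings(logical_cores: int) -> Tuple[int, int]:
    """Return (max threads per task, max simultaneous tasks).

    Assumption: 2 logical cores per physical core (SMT/hyperthreading on).
    """
    # A quarter of the logical cores, but never fewer than one thread.
    threads_per_task = max(1, logical_cores // 4)
    # Half of the logical cores, divided by the threads per task.
    simultaneous_tasks = max(1, (logical_cores // 2) // threads_per_task)
    return threads_per_task, simultaneous_tasks


threads, tasks = primegrid_settings(24)
print(threads, tasks)  # 6 2
```

The product of the two values equals the number of physical cores, so only half of the logical cores are busy at any time.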

pascaltec
Message 155570 - Posted: 23 May 2022 | 6:51:33 UTC
Last modified: 23 May 2022 | 6:54:27 UTC

Tasks returned in the last 24 hours

Setup: PrimeGrid Preferences / Job Control and Multi-threading / Multi-threading: Max # of threads for each Task
Result: Throughput (Tasks/day) = 86,400 (s/day) / average Elapsed time (s) x Concurrent Tasks (Maximum # of simultaneous PrimeGrid Tasks)

Date  Time  Sub-project Host   SMT Concurrent threads Tasks Firsts First%  Send/receive duration        Elapsed time                 Throughput   CPU time
                                   Tasks      /Task   Done                 average / minimum / maximum  average / minimum / maximum  Tasks/day    average / minimum / maximum
23/05 07:00 Genefer 19  valtin gpu 1          0       1     0      0.00%   26,918 / 26,918 / 26,918     15,153 / 15,153 / 15,153     5.7          267 / 267 / 267
23/05 09:00 Genefer 19  valtin off 4 Tasks    3 thrds 4     4      100.00% 13,365 / 13,312 / 13,420     13,356 / 13,303 / 13,411     25.9 4Tx3t   39,976 / 39,857 / 40,115
23/05 07:00 Genefer 19  valtin off 2 Tasks    6 thrds 8     4      50.00%  9,530 / 6,868 / 14,399       7,072 / 6,814 / 7,380        24.4 2Tx6t   42,198 / 40,803 / 43,927

Fewer threads per task seem to yield better throughput; to be confirmed.

pascaltec
Message 155579 - Posted: 24 May 2022 | 4:55:04 UTC
Last modified: 24 May 2022 | 5:26:09 UTC

Tasks returned in the last 24 hours

Setup: PrimeGrid Preferences / Job Control and Multi-threading / Multi-threading: Max # of threads for each Task
Result: Throughput (Tasks/day) = 86,400 (s/day) / average Elapsed time (s) x Concurrent Tasks (Maximum # of simultaneous PrimeGrid Tasks)

Date  Time  Sub-project Host   SMT Concurrent threads Tasks Firsts First%  Send/receive duration        Elapsed time                 Throughput   CPU time
                                   Tasks      /Task   Done                 average / minimum / maximum  average / minimum / maximum  Tasks/day    average / minimum / maximum
23/05 07:00 Genefer 19  valtin gpu 1          0       1     0      0.00    26,918 / 26,918 / 26,918     15,153 / 15,153 / 15,153     5.7          267 / 267 / 267
23/05 07:00 Genefer 19  valtin off 4 Tasks    3 thrds 4     4      100.00  13,365 / 13,312 / 13,420     13,356 / 13,303 / 13,411     25.9 4Tx3t   39,976 / 39,857 / 40,115
23/05 07:00 Genefer 19  valtin off 2 Tasks    6 thrds 8     4      50.00   9,530 / 6,868 / 14,399       7,072 / 6,814 / 7,380        24.4 2Tx6t   42,198 / 40,803 / 43,927
24/05 07:00 Genefer 19  valtin gpu 1          0       2     2      100.00  42,410 / 35,418 / 49,402     15,139 / 15,098 / 15,180     5.7          253 / 249 / 257
24/05 07:00 Genefer 19  valtin off 4 Tasks    3 thrds 8     8      100.00  13,565 / 13,312 / 14,049     13,555 / 13,303 / 14,038     25.5 4Tx3t   40,520 / 39,857 / 41,912
24/05 07:00 Genefer 19  valtin off 2 Tasks    6 thrds 2     2      100.00  6,799 / 6,793 / 6,805        6,784 / 6,779 / 6,789        25.5 2Tx6t   40,630 / 40,616 / 40,645
24/05 07:00 Genefer 17  valtin off 12 Tasks   1 thrd  12    2      16.67   2,281 / 2,250 / 2,321        2,262 / 2,242 / 2,280        458 12Tx1t   2,256 / 2,237 / 2,270
24/05 07:00 Genefer 17  valtin off 6 Tasks    2 thrds 6     2      33.33   1,351 / 1,330 / 1,386        1,293 / 1,276 / 1,316        401 6Tx2t    2,578 / 2,544 / 2,625
24/05 07:00 Genefer 16  valtin off 12 Tasks   1 thrd  12    12     100.00  559 / 536 / 714              524 / 520 / 526              1978 12Tx1t  520 / 519 / 521
24/05 07:00 Genefer 16  valtin off 6 Tasks    2 thrds 6     5      83.33   377 / 358 / 391              311 / 310 / 312              1667 6Tx2t   618 / 614 / 620
24/05 07:00 Genefer 16  valtin off 4 Tasks    3 thrds 6     2      33.33   322 / 239 / 462              220 / 218 / 223              1571 4Tx3t   653 / 645 / 663

Running the GPU task decreases the CPU throughput:
without GPU: 2Tx6t 25.5 Tasks/day
with GPU: 2Tx6t 24.4 Tasks/day

The GPU task, like any other task, takes CPU time from some core from time to time, because
cores are NOT reserved exclusively for BOINC or genefer, even if the genefer threads are pinned!
Even if only one thread is slightly slowed down, the lost time unbalances the parallel execution.
Scalability is perfect only when nothing perturbs the run (including background and user tasks).
Use another device to browse the PrimeGrid Challenge Stats, listen to music, calculate stats, ...
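
A toy model of this effect, assuming the threads of a task synchronize at the end of each step so that the step lasts as long as its slowest thread. That assumption and the 10% figure are illustrative only; the last two lines use the measured throughputs quoted above.

```python
def step_time(thread_times):
    """A synchronized step ends only when its slowest thread is done."""
    return max(thread_times)


undisturbed = [1.0] * 6        # six threads with equal work (arbitrary units)
disturbed = [1.0] * 5 + [1.1]  # one thread loses 10% to another process

slowdown = step_time(disturbed) / step_time(undisturbed) - 1
print(f"whole task slowed by {slowdown:.0%}")

# Measured 2Tx6t throughput without and with the GPU task running
loss = (25.5 - 24.4) / 25.5
print(f"measured throughput loss: {loss:.1%}")
```

In this model, a delay on a single thread costs the whole task as much as if all six threads had been delayed.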

Yves Gallot
Volunteer developer
Project scientist
Message 155583 - Posted: 24 May 2022 | 8:43:39 UTC - in response to Message 155579.
Last modified: 24 May 2022 | 8:50:06 UTC

The GPU decreases the throughput:
without GPU: 2Tx6t 25.5 Tasks/day
or with GPU: 2Tx6t 24.4 Tasks/day

On Windows, with Intel processors, I noticed that hyperthreading improves the throughput when a GPU app is running.
Computing preferences must be set to "Use at most 50% of the CPUs". Genefer threads can be pinned to even logical cores, i.e. to different physical cores. The GPU task/driver will run on an odd logical core, so its few instructions are interleaved with genefer code. Since genefer code is executed mainly on the FP units and the GPU app code on the integer units, this may not slow down the CPU app too much. It's not perfect, because they still have to share memory operations, but it helps.
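
To illustrate the pinning layout described here, the sketch below lists the even logical cores and builds the corresponding affinity bitmask. It assumes that logical cores 2k and 2k+1 share physical core k; it does not call any operating-system API, since applying the mask is platform specific, and it is not taken from genefer itself.

```python
def even_logical_cores(logical_cores: int) -> list:
    """Logical cores 0, 2, 4, ... (assumed: one per physical core)."""
    return list(range(0, logical_cores, 2))


def affinity_mask(cores) -> int:
    """Bitmask with one bit set per selected logical core."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


cores = even_logical_cores(24)  # 12 physical cores with hyperthreading
print(cores)
print(hex(affinity_mask(cores)))
```

The odd logical cores stay free for the GPU task and its driver, which matches the "Use at most 50% of the CPUs" setting.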

pascaltec
Message 155678 - Posted: 29 May 2022 | 7:30:39 UTC
Last modified: 29 May 2022 | 7:33:18 UTC

PrimeGrid's 2022 Challenge Series - Geek Pride Day Challenge - May 25 18:00:00 to May 30 17:59:59 - Statistics of throughput

Tasks returned in the last 24 hours: Throughput (Tasks/day) = 86,400 (s/day) / average Elapsed time (s) x Concurrent Tasks (Number of simultaneous PrimeGrid Tasks)

Sub-project Host     Threads  Tasks  Firsts  First%  Send/receive duration        Elapsed time                 CPU time                     Throughput
26/05 20:00          /Task                           average / minimum / maximum  average / minimum / maximum  average / minimum / maximum  Tasks/day
Genefer 19  4670S    4        2      0       0.00    25,247 / 23,570 / 26,923     25,123 / 23,561 / 26,685     99,037 / 92,850 / 105,225    3.44
Genefer 19  5900X    2        12     9       75.00   31,731 / 24,914 / 39,135     20,958 / 19,752 / 22,162     41,105 / 40,131 / 41,830     24.73
Genefer 19  5900X    6        12     8       66.67   7,863 / 6,782 / 12,416       6,790 / 6,774 / 6,841        40,581 / 40,480 / 40,716     25.45
Genefer 19  GTX 960           5      4       80.00   26,064 / 15,452 / 43,639     15,312 / 15,183 / 15,374     276 / 232 / 317              5.64
Genefer 19  GTX 560           1      1       100.00  25,625 / 25,625 / 25,625     25,615 / 25,615 / 25,615     25,593 / 25,593 / 25,593     3.37
5900X combined: 24 tasks, 25.09 Tasks/day
Total: 32 tasks, 22 firsts (68.75%), 37.54 Tasks/day

Sub-project Host     Threads  Tasks  Firsts  First%  Send/receive duration        Elapsed time                 CPU time                     Throughput
27/05 06:00          /Task                           average / minimum / maximum  average / minimum / maximum  average / minimum / maximum  Tasks/day
Genefer 19  4670S    4        3      0       0.00    28,086 / 23,570 / 33,765     26,740 / 23,561 / 29,973     105,733 / 92,850 / 119,124   3.23
Genefer 19  5900X    2        6      4       66.67   38,548 / 38,099 / 39,135     20,388 / 19,752 / 20,925     40,926 / 40,131 / 41,544     25.42
Genefer 19  5900X    3        8      3       37.50   31,173 / 31,133 / 31,197     15,147 / 14,514 / 15,871     42,146 / 40,870 / 43,587     22.81
Genefer 19  5900X    6        12     8       66.67   7,863 / 6,782 / 12,416       6,790 / 6,774 / 6,841        40,581 / 40,480 / 40,716     25.45
Genefer 19  GTX 960           6      3       50.00   28,892 / 15,452 / 43,639     15,290 / 15,183 / 15,374     259 / 232 / 317              5.65
Genefer 19  GTX 560           2      1       50.00   32,402 / 25,625 / 39,178     25,596 / 25,578 / 25,615     25,574 / 25,556 / 25,593     3.37
5900X combined: 26 tasks, 24.63 Tasks/day
Total: 37 tasks, 19 firsts (51.35%), 36.88 Tasks/day

Sub-project Host     Threads  Tasks  Firsts  First%  Send/receive duration        Elapsed time                 CPU time                     Throughput
27/05 08:00          /Task                           average / minimum / maximum  average / minimum / maximum  average / minimum / maximum  Tasks/day
Genefer 19  4670S    4        3      0       0.00    28,086 / 23,570 / 33,765     26,740 / 23,561 / 29,973     105,733 / 92,850 / 119,124   3.23
Genefer 19  5900X    3        12     4       33.33   35,234 / 31,133 / 43,432     14,513 / 13,219 / 15,871     41,306 / 39,495 / 43,587     23.81
Genefer 19  5900X    6        12     8       66.67   7,863 / 6,782 / 12,416       6,790 / 6,774 / 6,841        40,581 / 40,480 / 40,716     25.45
Genefer 19  GTX 960           6      2       33.33   31,202 / 15,452 / 43,639     15,315 / 15,206 / 15,374     247 / 232 / 287              5.64
Genefer 19  GTX 560           3      1       33.33   38,637 / 25,625 / 51,108     25,595 / 25,578 / 25,615     25,573 / 25,556 / 25,593     3.37
5900X combined: 24 tasks, 24.63 Tasks/day
Total: 36 tasks, 15 firsts (41.66%), 36.87 Tasks/day

Sub-project Host     Threads  Tasks  Firsts  First%  Send/receive duration        Elapsed time                 CPU time                     Throughput
27/05 20:00          /Task                           average / minimum / maximum  average / minimum / maximum  average / minimum / maximum  Tasks/day
Genefer 19  4670S    4        3      1       33.33   39,200 / 32,371 / 51,463     28,498 / 23,162 / 32,360     111,310 / 91,782 / 123,02    3.18
Genefer 19  5900X    3        12     4       33.33   35,234 / 31,133 / 43,432     14,513 / 13,219 / 15,871     41,306 / 39,495 / 43,587     23.81
Genefer 19  5900X    6        12     3       25.00   8,489 / 6,840 / 13,132       6,815 / 6,770 / 6,891        40,767 / 40,511 / 41,226     25.35
Genefer 19  GTX 960           6      1       16.67   27,899 / 15,451 / 41,239     15,322 / 15,206 / 15,404     243 / 233 / 255              5.64
Genefer 19  GTX 560           3      0       0.00    38,857 / 26,286 / 51,108     25,604 / 25,578 / 25,641     25,581 / 25,556 / 25,617     3.37
Total: 36 tasks, 9 firsts, 36.77 Tasks/day

Sub-project Host     Threads  Tasks  Firsts  First%  Send/receive duration        Elapsed time                 CPU time                     Throughput
29/05 09:00          /Task                           average / minimum / maximum  average / minimum / maximum  average / minimum / maximum  Tasks/day
Genefer 19  4670S    4        2      0       0.00    40,220 / 24,493 / 55,947     27,140 / 24,481 / 29,798     106,184 / 94,966 / 117,401   3.18
Genefer 19  5900X    6        24     7       29.17   25,659 / 6,827 / 56,518      6,847 / 6,760 / 6,941        40,919 / 40,472 / 41,512     25.23
Genefer 19  GTX 960           5      2       40.00   40,206 / 16,809 / 70,794     15,394 / 15,385 / 15,405     296 / 237 / 377              5.61
Genefer 19  GTX 560           2      1       50.00   25,703 / 25,678 / 25,728     25,642 / 25,627 / 25,656     25,631 / 25,627 / 25,635     3.37
Total: 33 tasks, 10 firsts (30.00%), 37.39 Tasks/day
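
The summary figures in these statistics can be reproduced with two small helpers. The First% is simply firsts divided by tasks. The combined 5900X throughput appears to be a mean weighted by the number of tasks; that is inferred from the numbers and not stated in the post, and the function names are hypothetical.

```python
def first_percent(firsts: int, tasks: int) -> float:
    """Share of returned tasks that were first, in percent."""
    return 100.0 * firsts / tasks


def weighted_throughput(rows) -> float:
    """rows: (tasks, tasks_per_day) pairs; mean weighted by task count."""
    total_tasks = sum(tasks for tasks, _ in rows)
    return sum(tasks * per_day for tasks, per_day in rows) / total_tasks


# 26/05 20:00 totals: 22 firsts out of 32 tasks
print(round(first_percent(22, 32), 2))

# 27/05 06:00, the three 5900X rows: (tasks, Tasks/day)
rows_5900x = [(6, 25.42), (8, 22.81), (12, 25.45)]
print(round(weighted_throughput(rows_5900x), 2))
```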

Copyright © 2005 - 2022 Rytis Slatkevičius (contact) and PrimeGrid community. Server load 1.36, 1.56, 1.28
Generated 15 Aug 2022 | 3:14:59 UTC