Issue with PLE coupling between Code Saturne and Syrthes

kenneth
Posts: 11
Joined: Sun Jul 18, 2021 9:23 pm

Post by kenneth »

Hi all and Yvan,

I am having issues with the PLE coupling between code_Saturne version 7 and Syrthes 5.0.8. The problem arises when I try to use multiple processors on the Code_Saturne side.

Essentially, it seems that for ranks other than the root rank, the location is wrong on the Code_Saturne side, whereas if I run in serial the location is as expected (I compare against our old implementation using Code_Saturne 4 and Syrthes 4). To illustrate this, I have included contour plots of the located solid areas in one of the fluid domain sub-regions. The first figure is for the serial Code_Saturne computation (the result is as expected) and the second figure uses two ranks on the Code_Saturne side. On the second rank, the located areas are slightly off. Please note that a single Syrthes rank is used in both tests.
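[Figure: Serial.png, located solid areas, serial Code_Saturne run]
[Figure: Two-processes.png, located solid areas, two Code_Saturne ranks]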
An additional caveat is that I have altered the cs_syr4_coupling_recv_tsolid() function, specifically the call to ple_locator_exchange_point_var() for the volume-coupled cells. Of note, we recover the passed data through distant_var (sized by the number of distant points) and recover the located elements for the distant points using ple_locator_get_dist_locations(). Please find attached the cs_syr4_coupling.c file in the zip folder. The data received here is what I use to compute the solid area shown in the preceding figures.

I would be grateful for any ideas or help on where I may have gone wrong.

Thank you and kind regards,

Kenneth
Attachments
SYR4.zip
(254.23 KiB) Downloaded 115 times
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Issue with PLE coupling between Code Saturne and Syrthes

Post by Yvan Fournier »

Hello,

Your modifications to the base code are quite significant, so it is not easy to check whether there is an error in the code itself or whether it may be due to changes in the PLE code. The change of indexing from 1-based to 0-based in some areas could lead to an error if you missed an update somewhere, though you would probably also see it in serial mode.

There have not been that many changes in PLE over the last few years except a few bug fixes and portability and minor documentation improvements, so I do not know what could cause this.

In general, to check for bugs in coupling, I exchange a well-known field, for example face or cell center coordinates, whose visualization immediately tells you whether it is consistent or not.
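The check described above can also be automated once the coordinates have been exchanged. The following is a minimal sketch, not code from code_saturne or PLE: the function name, the array layout (3 interleaved coordinates per point) and the 0-based element ids are all assumptions made for illustration. It counts received points that lie farther than a chosen distance from the center of the element reported as containing them.

```c
#include <assert.h>
#include <stddef.h>

/* Count received points lying farther than max_dist from the center of
   the element that located them.
     recv_xyz   : coordinates received through the coupling, 3 per point
     elt_id     : locating element id per point (assumed 0-based here)
     elt_center : element centers, 3 per element
   All names and layouts are hypothetical. */
size_t
count_misplaced_points(size_t         n_points,
                       const double  *recv_xyz,
                       const int     *elt_id,
                       const double  *elt_center,
                       double         max_dist)
{
  assert(max_dist >= 0.0);

  size_t n_bad = 0;

  for (size_t i = 0; i < n_points; i++) {
    const double *p = recv_xyz + 3*i;
    const double *c = elt_center + 3*(size_t)elt_id[i];

    /* Squared distance between the received point and the element center */
    double d2 = 0.0;
    for (int k = 0; k < 3; k++) {
      double d = p[k] - c[k];
      d2 += d*d;
    }

    if (d2 > max_dist*max_dist)
      n_bad++;
  }

  return n_bad;
}
```

A nonzero count on some ranks only would point to the location or exchange step rather than to the data preparation; max_dist would typically be chosen on the order of the local cell size.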

I do not understand how your use of ple_locator_exchange_point_var at line 1946 works, as it is a reverse exchange (often trickier to use), and recv_buf is allocated, but nothing seems to be exchanged as "local_var".

Best regards,

Yvan
kenneth
Posts: 11
Joined: Sun Jul 18, 2021 9:23 pm

Re: Issue with PLE coupling between Code Saturne and Syrthes

Post by kenneth »

Hi Yvan,

Thank you for your help. Yes, it is a bit perplexing as to what may be the cause of this. I'll continue to try debugging this and carefully checking through the changes.

Regarding the call to ple_locator_exchange_point_var(), we have modified what Syrthes sends, as shown in the code snippet below. In this exchange, Syrthes only sends the local temperatures, with distant_var set to NULL. On the Code_Saturne side, we only receive. We need the variable on the Code_Saturne side based on the distant locations, as we later use these values to compute a sub-channel averaged temperature.

Syrthes send function
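static void
_send_elt_var_hybrid(syr_cfd_interpolation_t *ip,
                     const double            *t_face,
                     const double            *s_face)
{
  int i;

  double *var_send = NULL;

  if (ip == NULL)
    return;

  /* Interleave the two values per coupled element (stride 2) */
  PLE_MALLOC(var_send, ip->n_coupled_elts*2, double);

  for (i = 0; i < ip->n_coupled_elts; i++) {
    var_send[i*2]     = t_face[i];
    var_send[i*2 + 1] = s_face[i];
  }

  /* Reverse exchange: only local values are sent, nothing is received here */
  ple_locator_exchange_point_var(ip->locator,
                                 NULL,            /* distant_var */
                                 var_send,        /* local_var */
                                 NULL,            /* local_list */
                                 sizeof(double),  /* type_size */
                                 2,               /* stride */
                                 1);              /* reverse */

  PLE_FREE(var_send);
}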
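
For readers following the receiving side, here is a minimal sketch of how interleaved pairs received per distant point could be averaged onto the elements that located them. It is not the code from the attached cs_syr4_coupling.c: the function name and arguments are hypothetical, and it assumes 0-based element ids of the kind returned by ple_locator_get_dist_locations() (subtract 1 first if the ids at hand are 1-based).

```c
#include <assert.h>
#include <stddef.h>

/* Average interleaved (t, s) pairs, received per distant point, onto the
   local elements that located those points.
     dist_loc : locating element id per distant point (assumed 0-based)
     recv     : 2 values per point, recv[2*i] = t and recv[2*i+1] = s
     t_avg, s_avg, count : arrays of size n_elts, overwritten
   All names are hypothetical. */
void
average_per_element(size_t         n_dist,
                    const int     *dist_loc,
                    const double  *recv,
                    size_t         n_elts,
                    double        *t_avg,
                    double        *s_avg,
                    int           *count)
{
  for (size_t e = 0; e < n_elts; e++) {
    t_avg[e] = 0.0;
    s_avg[e] = 0.0;
    count[e] = 0;
  }

  /* Accumulate sums and hit counts per locating element */
  for (size_t i = 0; i < n_dist; i++) {
    assert(dist_loc[i] >= 0 && (size_t)dist_loc[i] < n_elts);
    size_t e = (size_t)dist_loc[i];
    t_avg[e] += recv[2*i];
    s_avg[e] += recv[2*i + 1];
    count[e] += 1;
  }

  /* Elements that located no point keep a zero average */
  for (size_t e = 0; e < n_elts; e++) {
    if (count[e] > 0) {
      t_avg[e] /= count[e];
      s_avg[e] /= count[e];
    }
  }
}
```

The bounds assertion is deliberate: an off-by-one in the element ids (1-based read as 0-based) either trips it or silently shifts values to a neighboring element, which is one way a located field can end up slightly off.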

Thank you again and best regards,

Kenneth
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Issue with PLE coupling between Code Saturne and Syrthes

Post by Yvan Fournier »

Hello,

Ok that is clearer. In any case, as suggested for debugging, exchanging a quantity such as cell centers (from the Syrthes side in this case) can help check if the issue is in the location/exchange (in which case the visualized field would not look like a regular gradient), or in the data preparation (where parallel reductions may come into play).

Best regards,

Yvan
kenneth
Posts: 11
Joined: Sun Jul 18, 2021 9:23 pm

Re: Issue with PLE coupling between Code Saturne and Syrthes

Post by kenneth »

Hi Yvan,

Just a quick update: I managed to get the coupling working by switching the exchange algorithm. Instead of _exchange_point_var_distant_asyn, I used _exchange_point_var_distant, by setting _ple_locator_async_threshold to 0. Unfortunately, I have not been able to identify why the asynchronous exchange gives me scrambled results (for all ranks except the root rank). If you have any ideas, I would be interested to hear them.

Many thanks,

Kenneth
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Issue with PLE coupling between Code Saturne and Syrthes

Post by Yvan Fournier »

Hello,

Thanks for the feedback. I don't have any explanation other than a bug, so I'll look into this soon. Luckily, this is one of the smaller portions of the code.

Also, I seem to remember you were using the reverse mode in quite a few messages, while this is rarely used in most coupling schemes, so it is possible that a bug affecting only the async + reverse mode could have gone undetected for a while. In any case, given your feedback, this is what I'll proofread first. I am glad to know you have a workaround, so you are not stuck in your other work.

I'll keep you informed.

Best regards,

Yvan
kenneth
Posts: 11
Joined: Sun Jul 18, 2021 9:23 pm

Re: Issue with PLE coupling between Code Saturne and Syrthes

Post by kenneth »

Hello Yvan and all,

I have another issue with the PLE coupling between Code_Saturne and Syrthes. The issue arises for some of my larger tests (increased domain size, different rod bundle configuration and a larger number of points to be located). For the smaller tests, it works fine, and I do not get such problems.

The issue I have stems from the ple_locator_extend_search function, which leads to the warning "fluid mesh elements not located on solid mesh". The search tolerance then extends to infinity. I have pasted the last few lines printed in the listing file below. Checking the location of solid elements in the fluid domain against a known field, I find that the location is off in certain parts of the fluid domain.

Extending search with tolerance factor 170141183460469231731687303715884105728.000000... [failed]
2955 fluid mesh elements not located on solid mesh

Extending search with tolerance factor inf... [ok]
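
A note on the numbers in this log: the last finite factor, 170141183460469231731687303715884105728, is exactly 2^127, and one further doubling gives 2^128, which exceeds the largest finite single-precision value and therefore prints as inf. The sketch below only reproduces that arithmetic. It assumes, without confirmation from the thread, that the factor is held as a C float and doubled at each extension; the function names and the starting value of 1 are hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Return the factor obtained after n_doublings doublings of start,
   stored in single precision at every step. */
float
double_tolerance(float start, int n_doublings)
{
  assert(n_doublings >= 0);

  /* volatile forces rounding to float after each multiplication */
  volatile float tol = start;

  for (int i = 0; i < n_doublings; i++)
    tol *= 2.0f;

  return tol;
}

/* Nonzero if the factor has overflowed to +infinity */
int
tolerance_overflowed(float tol)
{
  return tol > FLT_MAX;
}
```

Under these assumptions, reaching a factor of this size means many successive extensions failed, so the final "[ok]" at an infinite tolerance says little about whether the matches found are geometrically meaningful.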

I have checked the coordinates of the missed local fluid cells using ple_locator_get_exterior_list, and checked the coordinates of the received distant points from syrthes using ple_locator_get_dist_coords. Both sets of coordinates used at the initialization of the locator seem fine to me, and the set of points indeed overlap.
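
A coordinate comparison of this kind can be made quantitative with a bounding-box test. The following is a minimal sketch with hypothetical names and an assumed layout of 3 interleaved coordinates per point. It builds the axis-aligned bounding box of one point set and counts how many points of the other set fall outside it.

```c
#include <assert.h>
#include <stddef.h>

/* Axis-aligned bounding box (hypothetical helper type) */
typedef struct {
  double min[3];
  double max[3];
} bbox_t;

/* Bounding box of n_points points stored as x0 y0 z0 x1 y1 z1 ... */
bbox_t
bbox_of_points(size_t n_points, const double *xyz)
{
  assert(n_points > 0);

  bbox_t b;
  for (int k = 0; k < 3; k++)
    b.min[k] = b.max[k] = xyz[k];

  for (size_t i = 1; i < n_points; i++) {
    for (int k = 0; k < 3; k++) {
      double x = xyz[3*i + k];
      if (x < b.min[k]) b.min[k] = x;
      if (x > b.max[k]) b.max[k] = x;
    }
  }

  return b;
}

/* Count points lying outside the box enlarged by margin on every side */
size_t
count_outside_bbox(size_t         n_points,
                   const double  *xyz,
                   const bbox_t  *box,
                   double         margin)
{
  size_t n_out = 0;

  for (size_t i = 0; i < n_points; i++) {
    int outside = 0;
    for (int k = 0; k < 3; k++) {
      double x = xyz[3*i + k];
      if (x < box->min[k] - margin || x > box->max[k] + margin)
        outside = 1;
    }
    n_out += (size_t)outside;
  }

  return n_out;
}
```

Overlapping bounding boxes are a necessary condition only: points can sit inside the box of the other mesh and still fall in a gap between its elements, so a zero count does not by itself rule out location failures.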

My current assumption is that there is likely some sort of optimization for large datasets (i.e. when the set of points to be located is large), which would explain why this happens for my larger cases and not the smaller ones. Presently, I am thinking of checking the low-level functions called by ple_locator_extend_search. However, a second opinion would be useful, in case there is something obvious I have missed. I have attached my cs_user_coupling.c to show how the coupling is defined.

Thank you for any help or advice you can give.

Kind regards,

Kenneth
Attachments
cs_user_coupling.c
(6.03 KiB) Downloaded 100 times
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Issue with PLE coupling between Code Saturne and Syrthes

Post by Yvan Fournier »

Hello Kenneth,

Did you try the fix on the master branch for your previous asynchronous mode issue?

Regarding the new issue, this seems completely different. Do you have a "run_solver.log" listing the successive tolerance extensions and number of unlocated elements?

I suspect some mesh quality issues could lead to location failing even with a sufficient tolerance, in which case you could go to infinity, but a log may help confirm this.

If you can reproduce this without your user subroutines (probable, as your changes seem to concern interpolation/exchange more than location) and have a small mesh or mesh portion on which this issue can be reproduced, that would be useful to me for checking/debugging.

Best regards,

Yvan
kenneth
Posts: 11
Joined: Sun Jul 18, 2021 9:23 pm

Re: Issue with PLE coupling between Code Saturne and Syrthes

Post by kenneth »

Hello Yvan,

Thank you for the prompt response and your help. I have responded to your queries below.

(1) I have not yet tried the fix for the asynchronous mode on the master branch. I will do so in due course. For the last few tests, I had stuck with v 7.0.0 and synchronous mode.

(2 & 3) I have been able to reproduce the issue with an unmodified version of Syrthes and no user subroutines for Code_Saturne (except for cs_user_coupling.c, used to define the coupling). The run_solver.log for this test and the case setup can be found at the following Dropbox link (the tarball is about 34 MB; please let me know of any issues retrieving the case).

dropbox link: https://www.dropbox.com/s/fy2a6qsjwzph6 ... ar.gz?dl=0


Many thanks and kind regards,

Kenneth
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Issue with PLE coupling between Code Saturne and Syrthes

Post by Yvan Fournier »

Hello Kenneth,

I took a look at the case, and it seems it is the volume coupling which is causing location issues.

I recommend deactivating the "allow_nonmatching" option for this case, and increasing the tolerance if needed.
This way, you will get postprocessing output, including coupled cells with their matching volumes, and a point cloud corresponding to the cell centers selected on the Syrthes side.

Doing this, it seems there are cells on the fluid side that do not have matching Syrthes cells. The "allow_nonmatching" option should handle that in theory by progressively increasing the tolerance, but doing this manually (in one step, with a higher initial tolerance) will allow at least some measure of analysis through postprocessing output.

Did you try this yet? If the geometry settings are those you want, we may have a bug or robustness issue and need to look in more detail, but if this is a case configuration issue, this may help get the case running more quickly.

Best regards,

Yvan