Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in MQTTAsync_connect() call when MQTT trying to connect initially #1539

Open
manojkotapati opened this issue Nov 19, 2024 · 11 comments
Open

Comments

@manojkotapati
Copy link

manojkotapati commented Nov 19, 2024

Describe the bug
When MQTT trying to connect for the first time after few connect tries, its crashed due to double free of memory in MQTTAsync_connect() call in MQTTAsync.c File

its crashing while trying to free private key

" if (m->c->sslopts->privateKey)
free((void*)m->c->sslopts->privateKey);
"
Issue is not reproducible Every time it Occurs once in a while

To Reproduce
Take a library trace as outlined in the README, and/or have a program or describe the steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Log files Line number may be changed to due to our local patch
#3 0x0000007f924b555c in malloc_printerr (str=str@entry=0x7f9256a798 "free(): double free detected in tcache 2") at malloc.c:5389
No locals.
#4 0x0000007f924b74e0 in _int_free (av=0x7f925aba28 <main_arena>, p=0x55c8d3ef90, have_lock=0) at malloc.c:4232
tmp =
e = 0x55c8d3efa0
tc_idx = 1
size = 48
fb =
nextchunk =
nextsize =
nextinuse =
prevsize =
bck =
fwd =
PRETTY_FUNCTION = "_int_free"
#5 0x0000007f929af840 in MQTTAsync_connect (handle=0x55c8db1e20, options=options@entry=0x55c8d418d8) at /usr/src/debug/paho-mqtt-c/1.3.13-r0/git/src/MQTTAsync.c:755
m = 0x55c8db1e20
rc = 0
conn =
thread_id =
locked =
#6 0x0000007f9359c7c8 in mqtt::async_client::connect (this=this@entry=0x55c8d417a0, opts=..., userContext=userContext@entry=0x0, cb=...) at /usr/src/debug/paho-mqtt-cpp/1.3.2-r0/git/src/async_client.cpp:486
tmpTok = Python Exception <class 'gdb.error'> value has been optimized out:

** Environment (please complete the following information):**

  • OS: [e.g. Linux]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

@x152152
Copy link

x152152 commented Nov 20, 2024

What does few connect tries mean? There are some operations to release and initialize SSL resources in MQTTAsync_connect. If you need to reconnect, I think you should use MQTTAsync_reconnect.

@manojkotapati
Copy link
Author

few connect tries mean, there was no internet connection MQTT will try to connect for the first time which will be initial connection only and when the internet is available at that this issue is occurring and MQTTAsync_connect being used for this scenario, If connection happened and then disconnect then we can use MQTTAsync_reconnect

@x152152
Copy link

x152152 commented Nov 20, 2024

I've encountered a similar problem before. I modified the code related to initializing SSL: only initializing the SSL resource once and removing the free operation. It may not be the best solution, but the test results are good so far.

@manojkotapati
Copy link
Author

it would be helpful if you can share the code patch or code snippet of the modified code to resolve this issue

@x152152
Copy link

x152152 commented Nov 21, 2024

// I commented out this part
/*  
if (m->c->sslopts)
{
	if (m->c->sslopts->trustStore)
		free((void*)m->c->sslopts->trustStore);
	if (m->c->sslopts->keyStore)
		free((void*)m->c->sslopts->keyStore);
	if (m->c->sslopts->privateKey)
		free((void*)m->c->sslopts->privateKey);
	if (m->c->sslopts->privateKeyPassword)
		free((void*)m->c->sslopts->privateKeyPassword);
	if (m->c->sslopts->enabledCipherSuites)
		free((void*)m->c->sslopts->enabledCipherSuites);
	if (m->c->sslopts->struct_version >= 2)
	{
		if (m->c->sslopts->CApath)
			free((void*)m->c->sslopts->CApath);
	}
	free((void*)m->c->sslopts);
	m->c->sslopts = NULL;
}
*/
if (options->struct_version != 0 && options->ssl 
            && (ssl_opt_init == false || m->sslopts == NULL)) // add ssl_opt_init status
{
	if ((m->c->sslopts = malloc(sizeof(MQTTClient_SSLOptions))) == NULL)
             //...
    }
    ssl_opt_init = true;

@icraggs icraggs added this to the 1.3.14 milestone Nov 28, 2024
@icraggs
Copy link
Contributor

icraggs commented Nov 28, 2024

MQTTAsync_reconnect is intended to be used when using automatic reconnect. I'll take a look at this.

@icraggs
Copy link
Contributor

icraggs commented Nov 28, 2024

It looks like if you set an SSL option like private key on the first connect and then don't on a later connect it could cause this. I'll fix this up. Does this sound like your situation?

@icraggs
Copy link
Contributor

icraggs commented Nov 28, 2024

Actually looking again, I'm not sure how it can.

A client version and a client library trace along with a more detailed scenario could help.

@manojkotapati
Copy link
Author

Its very rarely reproducible and i have shared whatever logs i have in the initial coredump, MQTT Paho client version is PAHO MQTT Version: 1.3.13 in which we faced this issue.

@icraggs
Copy link
Contributor

icraggs commented Dec 16, 2024

What about some more details of your scenario?

  • Are you using automatic reconnect? Or the connectiionLost callback and a call to connect?
  • Or is the application trying to connect for the first time several times? (Now I see that you are probably doing this as the title of the issue includes "initially")
  • Do you always wait for the onFailure callback before calling connect again?
  • Are the SSL options the same on each connect call or do you vary them?
  • Are there multiple threads involved?

@icraggs icraggs modified the milestones: 1.3.14, 1.3.15 Dec 30, 2024
@manojkotapati
Copy link
Author

@icraggs

Are you using automatic reconnect? Or the connectiionLost callback and a call to connect?

listing on on_failure(const mqtt::token& tkn) callback and if its failed trying again to connect for the first time several times.

Do you always wait for the onFailure callback before calling connect again?

Yes

Are the SSL options the same on each connect call or do you vary them ?

SSL options are the same on each connect call

Are there multiple threads involved?

yes multiple threads are involved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants