It ran for less than a day. There were roughly 12k MWCS jobs, but not all of
those ran because of the REF file issue. I reran it last night, and it
finished within the last hour. While it did compute all the jobs, it
gave me the same error at the end. So it seems to be an error on exit, once
the compute_mwcs job is done (no exit code, and then SQL hangs?).
Of course, this could all be complicated by the fact that I am using the
Slurm job scheduler to handle processor assignment.
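For anyone hitting the same thing, here is a minimal sketch of one way to survive a dropped connection at the end of a long job: run the final database write through a small retry wrapper that reconnects before trying again. This is not MSNoise code. `run_with_reconnect`, `operation`, and `reconnect` are hypothetical names, and with pymysql you would presumably pass `retry_on=(pymysql.err.OperationalError,)`.

```python
import time


def run_with_reconnect(operation, reconnect, retries=3, delay=0.0,
                       retry_on=(ConnectionError,)):
    """Call operation(); if the connection dropped, reconnect and retry.

    operation -- hypothetical callable doing the database write
    reconnect -- hypothetical callable that opens a fresh connection
    retry_on  -- exception types treated as "connection lost";
                 BrokenPipeError is a subclass of ConnectionError
    """
    for attempt in range(1, retries + 1):
        try:
            return operation()
        except retry_on:
            if attempt == retries:
                # Out of attempts: let the caller see the real error.
                raise
            time.sleep(delay)
            reconnect()
```

The idea is only that the write which marks a job as done should not depend on a connection that has been idle for the whole computation.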
On Fri, Oct 19, 2018 at 11:01 AM Thomas Lecocq <Thomas.Lecocq(a)seismology.be>
wrote:
Ashton,
No, I don't think it's linked. If the REF file is not available, the
code should crash and not hang.
How long did your MWCS run? How many MWCS jobs are there? How many
stations / station pairs?
What is the content of your my.cnf (or other MySQL configuration file)?
Thomas
On 19/10/2018 at 18:53, Flinders, Ashton wrote:
Hi Thomas, I actually think this was related to the PR I submitted the
other day. Since I have a mix of stations (some 3-component, some only Z),
when compute_mwcs tried to calculate RR for a station pair that only had ZZ
and couldn't find the reference function, it crashed/hung. Then, after
hanging for a while, it threw the SQL error.
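To illustrate the failure mode, a minimal sketch of a guard that only keeps the components for which a reference file actually exists, so that RR is never attempted for a ZZ-only pair. The directory layout (`ref_dir/<component>/<pair>.MSEED`) and the function name are assumptions made for the example, not the actual MSNoise code or the PR itself.

```python
import os


def components_with_ref(ref_dir, pair, components):
    """Return the components that have a reference file for this pair.

    Assumes a hypothetical layout: ref_dir/<component>/<pair>.MSEED
    """
    available = []
    for comp in components:
        path = os.path.join(ref_dir, comp, pair + ".MSEED")
        if os.path.isfile(path):
            available.append(comp)
    return available
```

A job for a pair and component that comes back missing could then be skipped or flagged instead of being handed to the MWCS computation.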
On Thu, Oct 18, 2018 at 11:15 PM Thomas Lecocq <Thomas.Lecocq(a)seismology.be>
wrote:
> Hi Ashton
>
> it seems your MWCS computation took a looooooong time and the MySQL
> connection was killed during that time. Can you confirm ?
>
> Thomas
>
>
> On 18/10/2018 at 18:54, Flinders, Ashton wrote:
>> I get a strange crash partway through my MWCS step (see below), and
>> compute_mwcs is not finishing. E.g., I have 5 frequency bands, but for
>> bands 2-4 only 1 of 10 station-pair MWCSs gets calculated, even though
>> all the data is there in the stacks. I have tried rerunning compute_mwcs
>> by changing the flag back to 'T' for the station pairs where MWCS did not
>> get calculated, but it still crashes. The crash is repeatable.
>>
>> Any thoughts?
>>
>> (p.s. I also initially tried remaking the stacks, but it crashed at the
>> same point. The data looks good in the stacks)
>>
>> -ashton
>>
>> During handling of the above exception, another exception occurred:
>>
>> Traceback (most recent call last):
>>   File "/home/ashton/.local/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
>>     context)
>>   File "/home/ashton/.local/lib/python3.5/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
>>     cursor.execute(statement, parameters)
>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/cursors.py", line 165, in execute
>>     result = self._query(query)
>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/cursors.py", line 321, in _query
>>     conn.query(q)
>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/connections.py", line 859, in query
>>     self._execute_command(COMMAND.COM_QUERY, sql)
>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/connections.py", line 1096, in _execute_command
>>     self._write_bytes(packet)
>>   File "/home/ashton/anaconda3/envs/msnoise/lib/python3.5/site-packages/pymysql/connections.py", line 1048, in _write_bytes
>>     "MySQL server has gone away (%r)" % (e,))
>> pymysql.err.OperationalError: (2006, "MySQL server has gone away (BrokenPipeError(32, 'Broken pipe'))")
>>
>> The above exception was the direct cause of the following exception:
_______________________________________________
MSNoise mailing list
MSNoise(a)mailman-as.oma.be
http://mailman-as.oma.be/mailman/listinfo/msnoise
--
Ashton F. Flinders, Ph.D
U.S. Geological Survey
345 Middlefield Road
Menlo Park, CA 94025
(650) 329-5050