Hi again,
Thanks for the comments. Here's what I did to set just a single day for
processing, so that I can test the parameter settings. I looked into the
API code and needed to import from msnoise_table_def.py, but it seems to
work OK:
from msnoise.api import connect
from msnoise_table_def import Job

set_day = '2013-10-14'
jobtype = 'CC'

session = connect()

# Flag the CC jobs for the chosen day as To-do ('T')...
jobs_set = session.query(Job).filter(Job.jobtype == jobtype).filter(Job.day == set_day)
jobs_set.update({Job.flag: 'T'})

# ...and flag every other CC job as Done ('D') so compute_cc skips them.
jobs_unset = session.query(Job).filter(Job.jobtype == jobtype).filter(Job.day != set_day)
jobs_unset.update({Job.flag: 'D'})

session.commit()
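As a quick sanity check (plain SQLAlchemy, nothing msnoise-specific), I
count the flags before committing, to confirm only the chosen day stays
active:

print(jobs_set.count(), "jobs flagged 'T'")    # should be the jobs for 2013-10-14 only
print(jobs_unset.count(), "jobs flagged 'D'")  # everything else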
So now I have a jobs table with just the day I want set to 'T'. I hoped
I was ready to try 'msnoise compute_cc', but it seems to want me to set
Filters first. This appears to be referring to the MWCS filter
parameters? I am a little surprised, since I thought MWCS would come
later, and I don't understand how the CC computation would depend on
the MWCS filter parameters.
To tell you the truth, at the moment I am more interested in using the
msnoise cross-correlations as input to a tomography algorithm than in
MWCS itself. In any case I am keen to look at the CCs to check that
they make sense before I move on to anything else.
Could you please advise whether there is a way to run compute_cc
without having to worry about the MWCS parameters?
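In the meantime, I am considering simply defining a dummy filter through
the same session, just so compute_cc has a filter row to work with. A
sketch of what I have in mind; the column names are my reading of the
Filter class in msnoise_table_def.py, and the mwcs_* values are pure
placeholders, so treat this as a sketch rather than gospel:

from msnoise.api import connect
from msnoise_table_def import Filter

session = connect()

# One bandpass filter for the CC step; the mwcs_* values are placeholders
# since I only care about the cross-correlations for now.
f = Filter()
f.low = 0.1            # lower corner of the CC bandpass (Hz)
f.high = 1.0           # upper corner of the CC bandpass (Hz)
f.mwcs_low = 0.1       # placeholder
f.mwcs_high = 1.0      # placeholder
f.mwcs_wlen = 10.0     # placeholder (s)
f.mwcs_step = 5.0      # placeholder (s)
f.rms_threshold = 0.0
f.used = True

session.add(f)
session.commit()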
Thanks,
- Phil
Thomas Lecocq wrote:
Hi guys,
Yeah, I have been thinking about a "benchmark" mode for quite a number
of weeks, i.e. since I tested a first run of PWS in order to compare
the final dv/v; to compare properly I have to test quite a number of
parameters.
My current idea is to run a set of possible parameters for the
different steps. This would lead to a large number of branches in a
large tree, but it would definitely be quite interesting.
I am really not in favor of duplicating the database; I would rather
create a "config" file with a caller script to set/change
parameters... Theoretically, the API should let you do all the
actions. The only thing that would be a little trickier is to
store/reuse the results of each step in order to compare them. For
info, using the "shutil" module you can move/copy files easily.
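To make the idea concrete, a minimal sketch of such a caller script.
The parameter names/values and the output folder name are illustrative
only (the folder depends on your output_folder setting); connect and
update_config are from msnoise.api:

import shutil
from msnoise.api import connect, update_config

# Example parameter sets to benchmark; names/values are illustrative.
trials = [
    {'corr_duration': '1800', 'overlap': '0.5'},
    {'corr_duration': '3600', 'overlap': '0.0'},
]

session = connect()
for i, params in enumerate(trials):
    for name, value in params.items():
        update_config(session, name, value)  # same effect as "msnoise config --set"
    # ... run the processing step here, e.g. os.system('msnoise compute_cc') ...
    # then stash this branch's results for later comparison:
    shutil.copytree('CROSS_CORRELATIONS', 'CROSS_CORRELATIONS_trial_%02d' % i)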
Let's keep brainstorming on that and see how it goes!
Cheers
Thomas
On 01/05/2016 16:52, Lukas Preiswerk wrote:
Hi all
I was in a similar situation as Phil, and I used (1). It’s not
straightforward to copy the database and make msnoise work again in a
new directory, but it’s definitely possible.
I actually think it would be a nice addition to msnoise to allow not
only multiple filters, but also multiple values of other parameters
(window lengths, overlaps, windsorizing, etc.). This would really help
in the first “exploratory phase” to find out the best way to process
your dataset.
What do you think of this idea? Practically I would implement it by
moving these parameters (window length etc.) into the filter
parameters, and treating each combination in the same way as an
additional filter, roughly like the sketch below. As far as I
understand the code, this wouldn’t require many adaptations…
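Something like this, as a rough sketch of what I mean; the table and
column names here are hypothetical, not msnoise's:

from sqlalchemy import Column, Integer, Float
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

# A Filter-like table where each row carries not only the band edges
# but also the processing parameters to benchmark.
class FilterTrial(Base):
    __tablename__ = 'filter_trials'
    ref = Column(Integer, primary_key=True)
    low = Column(Float)            # CC bandpass low corner (Hz)
    high = Column(Float)           # CC bandpass high corner (Hz)
    corr_duration = Column(Float)  # window length (s)
    overlap = Column(Float)        # window overlap fraction
    windsorizing = Column(Float)   # windsorizing level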
Lukas
2016-05-01 11:35 GMT+02:00 Thomas Lecocq <Thomas.Lecocq(a)seismology.be>:
Hi Phil,
I'd say (3) would be better indeed. You can script msnoise using the
api. If you need to change params in the config, you can alternatively
use the "msnoise config --set name=value" command.
Please keep me updated on your progress & tests!
Thomas
On 01/05/2016 10:34, Phil Cummins wrote:
Hi again,
As some of you may recall, I'm just getting started with msnoise. I
have a large database and have managed to get my station and data
availability tables populated.
At this point, rather than running through the whole database,
processing it with parameters I hope might work, I'd rather process
small subsets, e.g. 1 day at a time, to experiment with window lengths,
overlaps, etc., to find what seems optimal. My question is: what's the
best way to process subsets of my database?
It seems to me I have several options:
(1) Make separate databases for each subset I want to test, and run
through the workflow on each.
(2) Set start and end times appropriate for my subset, re-scan, and
run through the workflow.
(3) Populate the jobs table, and write a script to activate only the
jobs I want and not the others.
I want to do a fair bit of testing with different parameters before I
run through the whole thing, so I think (3) may be best. But any advice
would be appreciated.
Regards,
- Phil
_______________________________________________
MSNoise mailing list
MSNoise(a)mailman-as.oma.be
http://mailman-as.oma.be/mailman/listinfo/msnoise