March 2023 - Migration - mailman.cs.vt.edu

VT Migration Prediction Update
by Brian Mayer 17 Mar '23


Alex,

Here is the final biweekly update. I am only including you on the distribution; feel free to share it more broadly. Holly will receive a copy with the invoice, as requested. Feel free to give me a call next week and we can discuss.

Thanks,
Brian

--
Brian Mayer
bmayer(a)cs.vt.edu
540-231-5907
Sanghani Center for Artificial Intelligence and Data Analytics <https://sanghani.cs.vt.edu/> - Virginia Tech <https://vt.edu/>


Final Report
by Brian Mayer 06 Mar '23


Alex,

Here is our Final Report and Lessons Learned, as discussed during our call on Tuesday. Let me know if there is anything else you'd like us to add. Also, as mentioned, over the next two weeks we are working to prepare the sample product output and all the elements that go into it. We are happy to set up another call if you'd like, and let us know if/when you have dates for an in-person meeting.

Thanks,
Brian and team

--
Brian Mayer
bmayer(a)cs.vt.edu
540-231-5907
Sanghani Center for Artificial Intelligence and Data Analytics <https://sanghani.cs.vt.edu/> - Virginia Tech <https://vt.edu/>


VT Migration Product Output Outline
by Brian Mayer 02 Mar '23


Alex,

I've attached an outline of what we think the output could include. I've discussed these items with the team, and we are confident most of them will come to fruition in the next two weeks (or sooner). A few are still being tweaked, tested, or completed, so we may not be able to include them in the final product for this effort.

On page 3 I have also listed a few items that could further enhance the report. We definitely won't be able to get to them in the next two weeks, but we could certainly work on them if we have more time. One of those is a list of the fastest-growing terms, which I know you were interested in.

I'll be working tonight and tomorrow on converting our slides into a document and attaching this sample output as an Appendix to the final report (which I will send to you tomorrow).

Thanks,
Brian

--
Brian Mayer
bmayer(a)cs.vt.edu
540-231-5907
Sanghani Center for Artificial Intelligence and Data Analytics <https://sanghani.cs.vt.edu/> - Virginia Tech <https://vt.edu/>


Notes from my discussion with Alex
by Brian Mayer 28 Feb '23


In addition to any quantitative predictions we provide (total, by sector, and by country) in the current dashboard...

Alex would like us to indicate which demographics we expect to see an uptick in. (*Bursts*?)
- We could look for bursts of keywords by demographic
  - country
  - family type
  - sector
  - conveyance (????)
- We know that this is pretty reliable as a total and by sector, but we may need to validate the others. Might be wise to develop a more objective validation methodology
  - e.g., What defines an uptick/burst in the encounter data (increase by X%?)
    - we tried using burst analysis but for some reason this didn't work
  - Then maybe we determine Accuracy/Precision/Recall
  - We will only be able to evaluate a limited number of observations (20-30)
- SHOULD WE DO BURST ANALYSIS? HOW MUCH? HOW SHOULD WE VALIDATE?

Alex liked the fastest-growing keywords from the EMBERS visualization (CAN WE PROVIDE THIS? CAN WE INCLUDE NON-KEYWORDS? RUN DQE?)

Alex liked the word cloud in the EMBERS visualization (WHAT COULD/SHOULD WE USE TO CREATE A WORD CLOUD?)

Alex liked the idea of "hot" keywords

Other aspects that could be included with "Hot" Keywords
- Group hot keywords by spaCy entity type, e.g., PERSON, NORP, FAC, ORG, GPE, etc.
- Can we include non-keywords?

*Modeling work still to do in addition to anything determined above*
- Take another look at the country-specific predictions and try to improve nMIL results
- Confidence Interval
- CBP Conveyance/Transportation analysis (*push this to future work*)

*Items Due to ODNI*
- Provide Alex an outline of the product
- Turn slides into document report
- Add a description/example of the output product to the final report

--
Brian Mayer
bmayer(a)cs.vt.edu
540-231-5907
Sanghani Center for Artificial Intelligence and Data Analytics <https://sanghani.cs.vt.edu/> - Virginia Tech <https://vt.edu/>
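
One possible way to make the validation idea in the notes concrete is sketched below. It is only an illustration, not the team's method: it assumes a burst means a period-over-period increase of more than X percent in the encounter counts, and it scores predicted bursts against observed ones with accuracy, precision, and recall. The function names (`flag_bursts`, `score_predictions`), the treatment of a zero baseline, and the numbers in the example are all hypothetical.

```python
from typing import Dict, List, Optional


def flag_bursts(counts: List[float], threshold_pct: float) -> List[bool]:
    """Flag each period whose count rose more than threshold_pct percent
    over the previous period. The first period has no baseline, so it is
    never flagged."""
    flags = [False]
    for prev, curr in zip(counts, counts[1:]):
        if prev > 0:
            change_pct = (curr - prev) / prev * 100.0
            flags.append(change_pct > threshold_pct)
        else:
            # Assumption: any rise from a zero baseline counts as a burst.
            flags.append(curr > 0)
    return flags


def score_predictions(predicted: List[bool],
                      observed: List[bool]) -> Dict[str, Optional[float]]:
    """Compare predicted burst flags with observed ones, period by period."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)
    fp = sum(1 for p, o in pairs if p and not o)
    fn = sum(1 for p, o in pairs if not p and o)
    tn = sum(1 for p, o in pairs if not p and not o)
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else None,
        # Precision and recall are undefined when their denominator is zero.
        "precision": tp / (tp + fp) if (tp + fp) else None,
        "recall": tp / (tp + fn) if (tp + fn) else None,
    }


if __name__ == "__main__":
    # Invented counts for illustration only; not real encounter data.
    observed_counts = [100, 105, 160, 150, 240]
    observed = flag_bursts(observed_counts, threshold_pct=25.0)
    predicted = [False, True, True, False, False]  # hypothetical model output
    print(observed)
    print(score_predictions(predicted, observed))
```

With only 20-30 observations to evaluate, each metric would rest on a handful of cases, so the choice of X would likely move the results a lot and is worth testing at several values.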
