Intro
This page is about the CAS Open Source Software Committee itself, not about open source software in general. Discussion about the committee takes place in the Committee Forum.
See also the committee page on netforum.casact.org.
Current Committee projects
Here are four projects the committee is currently working on:
1. Blog
The blog consists of relatively short, periodic posts from various committee members about some aspect of open source software that they find potentially useful for actuarial work. If you've solved an actuarial problem with open source software, consider writing a blog post about it to save other actuaries time.
Blog posts can be about beginner or advanced topics. They don't have to contain original research and are less formal than a research paper. To ensure high quality, blog posts will be reviewed lightly by a couple of other members of the committee. Here is an outline of the review process used for the first couple of blog posts:
- Authors should post a draft of their blog in the draft blogs page for discussion.
- At least two reviewers sign up to review the blog post (by editing the upcoming schedule section of this page, below).
- The reviewers review the blog post. This may take a couple of weeks. Issues to consider are:
- Does the blog post help people solve actuarial (or actuarially-related) problems? Is it clear from the title and introduction what the blog is about and what problem it addresses?
- Who is the post's intended audience? Will the blog post be clear to these people? (For instance, if the post is targeted at R beginners, is it only intelligible to R experts?)
- Does all the code work without modification? Does it depend on any special installation?
- Is the blog post (including any code) well-formatted? Does it use pictures (if any) effectively? Do any sentences need editing?
- To avoid groupthink, the reviewers send their feedback to the author and the blog committee head, not to each other. Once all the reviewers have sent their feedback, it is forwarded to all the reviewers.
- The author and the reviewers schedule a call and discuss the post. The point of the call is to get the author and reviewers on the same page about what editing is necessary or suggested, and whether the topic and content of the blog post are appropriate for the open source blog.
- If editing is needed, the author makes the necessary changes and informs the reviewers. The reviewers reread the post and may request another call to discuss it.
- When the reviewers and author agree, the blog post is ready for publication.
Here is the schedule of upcoming blog posts being edited or reviewed:
- Date TBA
- Topic: dplyr, a useful R package for data manipulation
- Author: Ben Escoto
- Reviewers: Brian Fannin, Michael Cao
- Date TBA
- Topic: Modeling ALAE Using Copulas
- Author: Greg McNulty
- Reviewers: Ben Escoto, Dan Murphy
- Date TBA
- Topic: Parallel processing in R (details TBA)
- Author: Stephen Lienhard
- Reviewers:
- Date TBA
- Topic: KML Files from R
- Author: Simon Tam
- Reviewers: Avraham Adler, Steve Berman
- Date TBA
- Topic: R and Actuarial Disclosure Requirements
- Author: Rajesh Sahasrabuddhe
- Reviewers:
- Date TBA
- Topic: Why Python for Data Analysis
- Author: Steve Yun
- Reviewers:
2. Webinars, Education and Presentations
WEP project lead: Brian Fannin
This project covers educational outreach other than the blog, i.e. anything with a live audience: webinars, workshops, seminars, and presentations.
Webinars:
- November 5, 2014
- Topic: Loss Simulation
- Presenters: Robert Bear, Joseph Marker and Hai You
Potential webinar topics for 2015:
- Loss reserving in R
- Integration of R and Excel
- Fitting loss distributions in R and Excel
- Reproducible analysis/literate programming with R
Workshops:
- RPM 2015
- March 9, 2015
- Dallas, TX
- Instructor: Brian Fannin
- CLRS 2015
- September 9-11, 2015
- Atlanta, GA
- Who?
- Others?
- Standard slide deck on GitHub?
Limited Attendance Seminars:
- When?
- Where?
- Who?
Presentations:
Presentations could mirror some of the webinar topics, with an opportunity for a deeper dive. Some ideas:
- Bayesian analysis with R
- Claim-level loss reserving with R
- Overview of Python: pandas, scikit-learn
3. Virtual machine / Test platform
Currently the Committee owns a publicly accessible virtual machine. See this thread for discussion (and password) and this page for how the image was created. Actuaries can point their web browser at this machine and try out R and a number of actuarial packages without worrying about installing R on their local machines. If they find it useful, they can make their own copy of our image and have their own fully functioning R server in minutes.
However, there are a number of improvements that could be made:
- It currently runs only on Amazon's limited-time "Free Tier". If this service is to remain running, we need to move it to paid hosting, presumably funded by the CAS.
- Only one person can use the service at a time. If a second person logs in, the first is logged out and the second sees the first person's screen. Ideally this can be improved.
- Instructions are at http://opensourcesoftware.casact.org/cloud. However, the image needs to be updated when new versions of R or Linux come out.
- The image could also be expanded to include more R packages or non-R open-source software.
Virtual Machine volunteers:
- <name> - <what I'm working on>
4. Wiki and Forums
Committee members can help develop the Wiki and forums on this site to be a useful resource for actuaries learning about Open Source Software.
Committee members (and outsiders) can exchange information about Open Source on the Open Source forum. Posts about the committee itself generally go in the Committee Business forum. Unless there are confidentiality issues, please use the Committee Business forum instead of email for discussing committee projects.
The wiki has great potential, but it needs streamlining, and keeping it up to date requires regular attention. A small number of up-to-date, high-quality pages is probably much more valuable than a large number of pages of unknown credibility. For instance, it may be valuable to maintain the following information on our wiki:
- Brief overview of how open source is currently being applied to solve actuarial problems.
- Volunteer(s):
- Up-to-date instructions on installing and maintaining R for typical insurance use.
- Volunteer(s):
- Pointers to Open Source resources specifically geared to actuarial applications.
- Volunteer(s):