Ira Sharenow Consulting

R and Statistics


Experienced R Programmer

For more than 10 years, I have been using R for data analysis, data manipulation, and graphing. To demonstrate the power of R, I have created a number of work samples. Please click on the links to read the papers. 


Analyzing Madison Home Sales with R

  

I did an initial analysis of single-family home sales in the Madison, Wisconsin area using regression, random forests, and boosting. I used the tidyverse for data manipulation and the ggplot2 package to create charts; the boxplot of home prices by Madison school and by decade uses faceting. I used the flextable package to create the table.

I performed a repeat sales analysis using the R slider package.


  

Madison Housing with code

Madison Housing without code

Madison Housing Repeat Sales Analysis


Trea Turner vs. Manny Machado

I am a Dodgers baseball fan, so I decided to have some fun and compare Trea Turner to Manny Machado as measured by batting average over the past five years.


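# Packages used below (pivot_longer is from tidyr)
library(tidyr)
library(ggplot2)

# The data: annual batting averages, one column per season
infielders = data.frame(players = c('Turner', 'Machado'),
 Y2018 = c(.271, .273),
 Y2019 = c(.298, .256),
 Y2020 = c(.335, .304),
 Y2021 = c(.328, .278),
 Y2022 = c(.298, .298)
)

# Reshaping the data to long format: one row per player and year
inf =
 infielders |>
 pivot_longer(cols = !players,
 names_to = 'years', names_prefix = 'Y', names_transform = as.integer,
 values_to = 'BA')

# Assumption: highlight_df is not defined in this excerpt; it is taken here to be
# Turner's 2021 season, the point the annotation below refers to
highlight_df = subset(inf, players == 'Turner' & years == 2021)

# Line chart of batting average by year, with the highlighted point in blue
ggplot(inf, aes(x = years, y = BA, color = players)) +
 geom_line(linewidth = 1.3) + geom_point() +
 geom_point(data = highlight_df, aes(x = years, y = BA), color = "blue", size = 3.5) +
 labs(title = "Trea Turner versus Manny Machado", subtitle = "Annual Batting Average 2018-2022") +
 theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5)) +
 xlab("Years") +
 ylab("Batting Average") +
 scale_y_continuous(
 labels = scales::number_format(accuracy = 0.001)) +
 annotate("text", x = 2021, y = .335, label = "2021: Trea Turner is MLB batting champ")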


  

Trea Turner versus Manny Machado with code

Trea Turner versus Manny Machado without code


El Cerrito finances

As an El Cerrito resident, I analyzed a lot of El Cerrito data in R (and Tableau). In El Cerrito Data in Pictures, I used the R package ggplot2 to make it easy for people to see how El Cerrito’s finances were doing.


El Cerrito Finances With Code

El Cerrito Finances Without Code


Statistical Learning Techniques

A few years ago, I took the online course taught by Hastie and Tibshirani based on their book An Introduction to Statistical Learning. Below is some code based on what I learned.


# Plot of residuals versus fitted
plot(fitted(HS09.lm1AVGED), residuals(HS09.lm1AVGED), xlab = "Fitted", ylab = "Residuals")
abline(h = 0, lwd = 2)
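
The snippet above depends on a fitted model object from the paper that is not shown on this page. As a minimal, self-contained sketch of the same diagnostic, the example below uses R's built-in mtcars data in place of the API dataset. The model formula and the names fit and resid_sum are assumptions chosen for illustration only, not the model from the paper.

```r
# Illustrative stand-in: mtcars and this formula are NOT the API data or model
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Residuals versus fitted values; a patternless scatter around the zero line
# is what one hopes to see if the linear form is adequate
plot(fitted(fit), residuals(fit), xlab = "Fitted", ylab = "Residuals")
abline(h = 0, lwd = 2)

# Sanity check: least-squares residuals from a model with an intercept
# sum to zero, up to floating-point error
resid_sum <- sum(residuals(fit))
```

Curvature or a funnel shape in a plot like this would suggest trying a transformation or additional terms.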


Statistical Learning with R: Using the API Dataset and a Sim





The Performance of Albany High School on the API: A Statistical Regression Analysis

Before I moved to El Cerrito, I lived in Albany. I decided to see how well Albany High School students were performing on certain standardized tests. I downloaded the data from the California Department of Education website and then I performed a variety of regression analyses.


The Performance of Albany High School on the API: A Statistical Regression Analysis


  

Copyright © 2022 IRASHARENOW.COM - All Rights Reserved.

