Software
R packages
Saha, Arkajyoti, Daniela Witten and Jacob Bien. "independencepvalue: Testing independence between groups of Gaussian variables" R package 0.0.2. (2022). Website
Description: independencepvalue is an R package that tests the independence between two groups of Gaussian variables when the groups were obtained by thresholding the correlation matrix, as detailed in Saha, Witten and Bien (2022). When a hypothesis is generated by thresholding the correlation matrix and then tested on the same data set, classical hypothesis testing may give invalid results. independencepvalue accounts for the fact that the hypothesis was selected from the data by computing p-values conditional on the selection event.
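To make the selection step concrete, here is a minimal Python sketch of one way groups can arise from thresholding a correlation matrix. It assumes that variables i and j are linked whenever the absolute correlation exceeds a threshold c, and that the groups are the connected components of the resulting graph. The function name `threshold_groups` and the toy matrix are hypothetical and are not part of the independencepvalue API. The sketch covers only the grouping step, not the conditional p-value.

```python
# Hypothetical illustration of the grouping step only; this is not the
# independencepvalue API and it does not compute any p-value.

def threshold_groups(corr, c):
    """Label variables by connected component of the graph that links
    i and j whenever |corr[i][j]| > c."""
    p = len(corr)
    labels = [None] * p
    n_groups = 0
    for start in range(p):
        if labels[start] is not None:
            continue  # already assigned to an earlier group
        labels[start] = n_groups
        stack = [start]
        while stack:  # depth-first search over above-threshold links
            i = stack.pop()
            for j in range(p):
                if labels[j] is None and abs(corr[i][j]) > c:
                    labels[j] = n_groups
                    stack.append(j)
        n_groups += 1
    return labels


# Toy sample correlation matrix (assumed values, for illustration).
corr = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.6],
    [0.0, 0.1, 0.6, 1.0],
]

print(threshold_groups(corr, c=0.5))   # two groups: {0, 1} and {2, 3}
print(threshold_groups(corr, c=0.15))  # a lower threshold merges everything
```

Because the groups depend on the observed correlations, testing their independence on the same data requires conditioning on this selection, which is the problem the package addresses.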
Saha, Arkajyoti, Sumanta Basu and Abhirup Datta. "RandomForestsGLS: Random Forests for dependent data" R package 0.1.4. (2020). CRAN Vignette
Description: RandomForestsGLS is a package for fitting non-linear regression models on dependent data (spatial and temporal) with the Generalized Least Squares (GLS) based Random Forest (RF-GLS), detailed in Saha, Basu and Datta (2021). For spatial data, RandomForestsGLS combines the strengths of Random Forests and Gaussian Processes to estimate and predict non-linear functions using the nearest neighbor Gaussian Process. For time-series data, RandomForestsGLS uses the autoregressive (AR) process covariance structure with Random Forests for estimation. To the best of our knowledge, RandomForestsGLS is the first package that uses Random Forests for estimation and prediction in a non-linear regression setup under correlated errors. This package is in the beta stage of development. 14,000+ CRAN downloads as of February 2023.
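As a rough illustration of what a GLS-style loss under an AR covariance looks like, the Python sketch below builds the covariance of a stationary AR(1) process and evaluates the quadratic form of the residuals with the inverse covariance by first whitening the residuals. It assumes AR(1) errors with known coefficient `rho` and innovation variance `sigma2`. All function names are hypothetical, and this is not the RandomForestsGLS implementation, which handles general AR orders and tree-based fits.

```python
import math

# Hypothetical sketch, assuming AR(1) errors; not the RandomForestsGLS API.

def ar1_cov(n, rho, sigma2):
    """Covariance of a stationary AR(1) process:
    Sigma[i][j] = sigma2 * rho**|i - j| / (1 - rho**2)."""
    scale = sigma2 / (1.0 - rho ** 2)
    return [[scale * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def whiten_ar1(v, rho):
    """Apply the AR(1) whitening transform L, chosen so that
    L Sigma L^T = sigma2 * I."""
    out = [math.sqrt(1.0 - rho ** 2) * v[0]]
    out.extend(v[t] - rho * v[t - 1] for t in range(1, len(v)))
    return out


def gls_loss(y, fitted, rho, sigma2):
    """GLS quadratic loss r^T Sigma^{-1} r, computed as ||L r||^2 / sigma2."""
    resid = [a - b for a, b in zip(y, fitted)]
    return sum(e * e for e in whiten_ar1(resid, rho)) / sigma2


# Toy series and fitted values (assumed numbers, for illustration).
y = [1.0, 1.4, 0.9, 1.6]
fitted = [1.1, 1.2, 1.0, 1.3]
print(gls_loss(y, fitted, rho=0.6, sigma2=1.0))
```

Setting `rho=0` reduces the loss to the ordinary sum of squared residuals divided by `sigma2`, which is the usual least-squares criterion.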
Saha, Arkajyoti, and Abhirup Datta. "BRISC: bootstrap for rapid inference on spatial covariances." R package 1.0.5. (2018). CRAN
Description: BRISC is a package for rapid estimation, prediction and inference for large spatial data in a frequentist setup. BRISC estimation and prediction rely on nearest neighbor approximations of the spatial Gaussian Process likelihood, and use a scalable parametric bootstrap to provide inference for all spatial parameters. BRISC provides frequentist confidence intervals for all parameters, including the spatial variance and range of the Gaussian Process. Inference from BRISC is highly competitive with that obtained from Bayesian approaches relying on MCMC, while being many times faster. 31,000+ CRAN downloads as of February 2023.
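The general idea of a parametric bootstrap can be sketched in a few lines of Python: fit a model, simulate new data sets from the fitted model, re-estimate on each, and read a confidence interval off the resulting estimates. The example below assumes independent Gaussian data and a percentile interval for the variance, so it shows only the generic recipe. It is not BRISC's algorithm, which works with spatially correlated data and nearest neighbor Gaussian Process approximations. The function name `parametric_bootstrap_ci` is hypothetical.

```python
import random
import statistics

# Hypothetical sketch of a generic parametric bootstrap for i.i.d. Gaussian
# data; it is not the BRISC API and ignores spatial correlation entirely.

def parametric_bootstrap_ci(data, n_boot=500, level=0.95, seed=0):
    rng = random.Random(seed)
    n = len(data)
    # Step 1: fit the parametric model (Gaussian mean and variance).
    mu_hat = statistics.fmean(data)
    var_hat = statistics.pvariance(data, mu_hat)
    sd_hat = var_hat ** 0.5
    # Step 2: simulate from the fitted model and re-estimate each time.
    boot = []
    for _ in range(n_boot):
        sim = [rng.gauss(mu_hat, sd_hat) for _ in range(n)]
        boot.append(statistics.pvariance(sim))
    # Step 3: percentile interval from the sorted bootstrap estimates.
    boot.sort()
    alpha = 1.0 - level
    lower = boot[int((alpha / 2) * (n_boot - 1))]
    upper = boot[int((1 - alpha / 2) * (n_boot - 1))]
    return var_hat, (lower, upper)


# Toy data drawn with true variance 4 (assumed setup, for illustration).
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 2.0) for _ in range(200)]
var_hat, (lower, upper) = parametric_bootstrap_ci(data)
print(var_hat, lower, upper)
```

Each bootstrap replicate is independent of the others, so the loop parallelizes naturally. That independence is one reason a bootstrap of this kind can scale well compared with sequential MCMC sampling.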