#data-analysis #machine-learning #analysis #eda #string #manipulation

bin+lib simple_ml

Functions required for data analysis and machine learning tasks

23 releases

0.3.2 Jul 12, 2020
0.3.1 Jul 2, 2020
0.3.0 Jun 28, 2020
0.2.11 Jun 27, 2020
0.1.12 May 31, 2020

#403 in Machine learning

MIT license

445KB
7.5K SLoC

Description

  • To make a library of functions that are frequently used for data analysis and machine learning tasks
  • Inspired by Python libraries like numpy, sklearn, pandas, etc.

Changes in this version

Section Added Fixed Removed
lib_matrix dataframe_comparision, datamap_comparision, vector_comparision, DataFrame > sort, DataMap > sort
lib_ts pacf

Comparison with scikit-learn's output

  • OLS
  • BLR
  • SSVM
  • KNN
  • Kmeans

Bibliography

List of Functions and Structs

lib_matrix

1. MatrixF : 
    > determinant_f
        x determinant_2
        x determinant_3plus
    > is_square_matrix
        x round_off_f
    > inverse_f
        x identity_matrix
        x zero_matrix

2. DataFrame:
    > describe
    > groupby
    > sort

3. DataMap:
    > describe
    > groupby
    > sort

1. dot_product
2. element_wise_operation
3. matrix_multiplication
4. pad_with_zero
5. print_a_matrix
6. shape_changer
7. transpose
8. vector_addition
9. make_matrix_float
10. make_vector_float
11. round_off_f
12. unique_values
13. value_counts
14. is_numerical
15. min_max_f
16. type_of
17. element_wise_matrix_operation
18. matrix_vector_product_f
19. split_vector
20. split_vector_at
21. join_matrix
22. make_matrix_string_literal
23. head
24. tail
25. row_to_columns_conversion
26. columns_to_rows_conversion
27. dataframe_comparision
28. datamap_comparision
29. vector_comparision
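
As a rough illustration of what three of the functions listed above (dot_product, transpose, matrix_multiplication) conventionally compute, here is a minimal standalone sketch. The signatures, the use of f64, and the Vec<Vec<f64>> row-major layout are assumptions made for this example; they are not taken from the crate and may differ from its actual API.

```rust
/// Dot product of two equal-length slices (assumed signature, not the crate's).
fn dot_product(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Transpose of a row-major matrix; assumes all rows have the same length.
fn transpose(m: &[Vec<f64>]) -> Vec<Vec<f64>> {
    if m.is_empty() {
        return Vec::new();
    }
    let cols = m[0].len();
    (0..cols)
        .map(|j| m.iter().map(|row| row[j]).collect())
        .collect()
}

/// Matrix product of an (n x k) matrix with a (k x m) matrix.
/// Each output cell is the dot product of a row of `a` with a column of `b`.
fn matrix_multiplication(a: &[Vec<f64>], b: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let bt = transpose(b); // columns of `b` become rows, so they can be zipped
    a.iter()
        .map(|row| bt.iter().map(|col| dot_product(row, col)).collect())
        .collect()
}
```

Transposing the right-hand matrix first keeps the inner loop a plain slice-by-slice dot product, at the cost of one extra allocation.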

lib_ml

1. OLS:
    > fit
2. BLR:
    > fit
    > sigmoid
    > log_loss
    > gradient_descent
    > change_in_loss
    > predict
3. KNN
    > fit
    x predict
4. Distance
    > distance_euclidean
    > distance_manhattan
    > distance_cosine
    > distance_chebyshev
5. Kmeans
    > fit
6. SSVM
    > fit
    x sgd
    x compute_cost
    x calculate_cost_gradient
    x predict
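
The four Distance entries above correspond to standard metrics. The sketch below shows their textbook definitions on f64 slices; the free-function form and signatures are assumptions for illustration, and the crate's distance_cosine in particular may return similarity rather than the 1 - similarity used here.

```rust
/// Euclidean (L2) distance: square root of the summed squared differences.
fn distance_euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Manhattan (L1) distance: sum of absolute differences.
fn distance_manhattan(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Chebyshev (L-infinity) distance: largest absolute difference.
fn distance_chebyshev(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f64::max)
}

/// Cosine distance, assumed here to mean 1 - cosine similarity.
/// Undefined (NaN) if either vector has zero norm.
fn distance_cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}
```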

1. coefficient
2. convert_and_impute
3. covariance
4. impute_string
5. mean
6. read_csv
7. root_mean_square
8. simple_linear_regression_prediction
9. variance
10. convert_string_categorical 
11. min_max_scaler
12. logistic_function_f
13. log_gradient_f 
14. logistic_predict 
15. randomize_vector
16. randomize
17. train_test_split_vector_f
18. train_test_split_f
19. correlation
20. std_dev
21. spearman_rank
22. how_many_and_where_vector
23. how_many_and_where
24. z_score
25. one_hot_encoding
26. shape
27. rmse
28. mse
29. mae
30. r_square
31. mape
32. drop_column
33. preprocess_train_test_split
34. standardize_vector_f
35. min_max_scaler
36. float_randomize
37. confuse_me
38. cv
39. z_outlier_f
40. percentile_f
41. quartile_f
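
Several of the helpers listed above (mean, variance, std_dev, z_score, min_max_scaler) are standard descriptive statistics and scalers. The following is a minimal sketch of their usual definitions, not the crate's code: the signatures are assumed, and variance is taken as the population variance (divide by n), which may not match the crate's convention.

```rust
/// Arithmetic mean of a non-empty slice.
fn mean(v: &[f64]) -> f64 {
    v.iter().sum::<f64>() / v.len() as f64
}

/// Population variance (divides by n; this choice is an assumption).
fn variance(v: &[f64]) -> f64 {
    let m = mean(v);
    v.iter().map(|x| (x - m).powi(2)).sum::<f64>() / v.len() as f64
}

/// Standard deviation as the square root of the variance above.
fn std_dev(v: &[f64]) -> f64 {
    variance(v).sqrt()
}

/// Z-score of each element: (x - mean) / std_dev.
fn z_score(v: &[f64]) -> Vec<f64> {
    let (m, s) = (mean(v), std_dev(v));
    v.iter().map(|x| (x - m) / s).collect()
}

/// Min-max scaling of each element into [0, 1].
/// Produces NaN if all values are equal (max == min).
fn min_max_scaler(v: &[f64]) -> Vec<f64> {
    let min = v.iter().copied().fold(f64::INFINITY, f64::min);
    let max = v.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    v.iter().map(|x| (x - min) / (max - min)).collect()
}
```

For the sample [2, 4, 4, 4, 5, 5, 7, 9], these definitions give a mean of 5, a variance of 4, and a standard deviation of 2.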

lib_nn

1. LayerDetails :
    > create_weights
    > create_bias
    > output_of_layer

1. activation_leaky_relu
2. activation_relu
3. activation_sigmoid
4. activation_tanh
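
For reference, the four activation functions named above have the following standard scalar definitions. This is only a sketch: the scalar f64 signatures and the explicit alpha parameter for leaky ReLU are assumptions, and the crate's versions may operate on vectors or matrices instead.

```rust
/// ReLU: max(0, x).
fn activation_relu(x: f64) -> f64 {
    x.max(0.0)
}

/// Leaky ReLU: x for positive inputs, alpha * x otherwise.
/// `alpha` is a hypothetical parameter; 0.01 is a common choice.
fn activation_leaky_relu(x: f64, alpha: f64) -> f64 {
    if x > 0.0 { x } else { alpha * x }
}

/// Logistic sigmoid: 1 / (1 + e^(-x)), mapping any real input into (0, 1).
fn activation_sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Hyperbolic tangent, mapping any real input into (-1, 1).
fn activation_tanh(x: f64) -> f64 {
    x.tanh()
}
```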

lib_string

1. StringToMatch :
    > compare_percentage
        x calculate
    > clean_string
        x char_vector
    > compare_chars
    > compare_position
    > fuzzy_subset
        x n_gram
    > split_alpha_numericals
    > char_count
    > frequent_char
    > char_replace

1. extract_vowels_consonants
2. sentence_case
3. remove_stop_words
4. tokenize

lib_ts

1. acf
2. simple_ma
3. exp_ma
4. best_fit_line
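
The time-series helpers above follow well-known formulas. Below is a minimal sketch of a simple moving average, an exponential moving average, and the autocorrelation at a single lag. The signatures, the `window`, `alpha`, and `lag` parameters, and the choice to seed the exponential average with the first observation are all assumptions for illustration, not the crate's documented behaviour.

```rust
/// Simple moving average over a sliding window of `window` points.
/// Returns series.len() - window + 1 values.
fn simple_ma(series: &[f64], window: usize) -> Vec<f64> {
    assert!(window > 0, "window must be positive");
    series
        .windows(window)
        .map(|w| w.iter().sum::<f64>() / window as f64)
        .collect()
}

/// Exponential moving average: s_0 = x_0, s_t = alpha * x_t + (1 - alpha) * s_{t-1}.
fn exp_ma(series: &[f64], alpha: f64) -> Vec<f64> {
    let mut out = Vec::with_capacity(series.len());
    let mut prev = match series.first() {
        Some(&x) => x,
        None => return out, // empty input gives empty output
    };
    out.push(prev);
    for &x in &series[1..] {
        prev = alpha * x + (1.0 - alpha) * prev;
        out.push(prev);
    }
    out
}

/// Sample autocorrelation at one lag: lagged covariance around the mean,
/// divided by the total sum of squared deviations.
fn acf(series: &[f64], lag: usize) -> f64 {
    assert!(lag < series.len(), "lag must be smaller than the series length");
    let n = series.len();
    let m = series.iter().sum::<f64>() / n as f64;
    let denom: f64 = series.iter().map(|x| (x - m).powi(2)).sum();
    let num: f64 = (lag..n)
        .map(|t| (series[t] - m) * (series[t - lag] - m))
        .sum();
    num / denom
}
```

With the series [1, 2, 3, 4, 5], a window of 3 yields [2, 3, 4], and the lag-1 autocorrelation under this definition is 0.4.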

Dependencies

~1.4–2.3MB
~39K SLoC