[AWS SageMaker LinearLearner][1]: binary_classifier_model_selection_criteria is a hyperparameter that seems useful even without cross-validation and hyperparameter tuning. At least it seems so to me.

If so, can you please explain how a model is trained when that hyperparameter is set to 'precision_at_target_recall' or 'recall_at_target_precision'?

I have not seen such a thing in scikit-learn. The only way that seems possible and reasonable to me is to adjust the decision threshold so that it meets the target_recall or target_precision; unfortunately, nothing is mentioned about the threshold or cut-off in the documentation, and I guess it is still at 0.5.
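To illustrate what I mean, here is the kind of threshold tuning I have in mind in scikit-learn (a minimal sketch with made-up data; the dataset and the 0.95 target are placeholders, only precision_recall_curve and LogisticRegression are standard scikit-learn APIs):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Toy, imbalanced data for illustration only
X, y = make_classification(n_samples=5000, weights=[0.9], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.predict_proba(X_val)[:, 1]

# Choose the threshold that maximizes precision subject to recall >= 0.95
precision, recall, thresholds = precision_recall_curve(y_val, scores)
target_recall = 0.95
# precision/recall have one more entry than thresholds; drop the last point
feasible = recall[:-1] >= target_recall
best_idx = np.argmax(np.where(feasible, precision[:-1], -np.inf))
threshold = thresholds[best_idx]

# Predict with the tuned cut-off instead of the default 0.5
y_pred = (scores >= threshold).astype(int)
```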


There is 1 answer

Olivier Cruchant

SageMaker Linear Learner differs from open-source linear models in several respects, in particular:

  • More control over performance: it features hyperparameters that force the model to respect a given precision or recall target. For example, if your goal is a classifier with at least 95% recall, you can set the hyperparameters accordingly: binary_classifier_model_selection_criteria='precision_at_target_recall' and target_recall=0.95 (see the sketch after the benchmark figure below)
  • It can explore multiple configurations in a single training job, effectively doing "tuning-in-training". This behavior is called "parallel training and model selection" in the documentation; it is distinct from the SageMaker Model Tuning feature and specific to the Linear Learner. You can control the number of models tested with the num_models parameter
  • It is usually much more scalable than alternatives: it is trained like a neural network with mini-batch SGD, which allows data-parallel distributed training over multiple CPU or GPU instances. It also supports Pipe-mode data loading, a low-latency, high-throughput SageMaker data-ingest technique that uses Unix named pipes to stream data from S3 directly into memory, making it possible to learn on immense datasets too big for local disks
  • It was designed to be highly efficient: we can only speculate about the underlying tricks, since the implementation is not public, but I believe the rich SGD customization options and the tuning-in-training design contribute to its cost efficiency. See below a benchmark against MLlib from the paper Elastic Machine Learning Algorithms in Amazon SageMaker by Liberty et al.
  • It comes with a built-in serving stack: you can easily deploy a trained model to a SageMaker-managed serving endpoint, without having to write any Dockerfile or web-server code yourself

(Benchmark figure: SageMaker Linear Learner vs. Spark MLlib, from Liberty et al.)
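As a concrete illustration of the first two points, here is a minimal sketch of a training job with those hyperparameters using the SageMaker Python SDK (the IAM role, bucket names, and paths are placeholders; exact calls may differ slightly between SDK versions):

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder IAM role

# Built-in Linear Learner container for the current region
container = image_uris.retrieve("linear-learner", session.boto_region_name)

ll = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/linear-learner/output",  # placeholder
    sagemaker_session=session,
)

ll.set_hyperparameters(
    predictor_type="binary_classifier",
    binary_classifier_model_selection_criteria="precision_at_target_recall",
    target_recall=0.95,
    num_models=32,  # parallel training and model selection
)

# Training data in a supported format (e.g. CSV with the label first); placeholder path
train_input = TrainingInput("s3://my-bucket/linear-learner/train/", content_type="text/csv")
ll.fit({"train": train_input})
```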

An additional notable difference between the SageMaker Linear Learner and the other SageMaker built-in algorithms is that you can read the trained model outside of SageMaker, using the MXNet deserialization code provided in the documentation.
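Roughly, the idea looks like this (a sketch assuming the model.tar.gz artifact contains a zipped MXNet module checkpoint, as the documentation describes; file and parameter names may vary by algorithm version):

```python
import os
import mxnet as mx

# Download model.tar.gz from the training job's output_path, then extract it
# (paths and file names below are assumptions based on the documented layout)
os.system("tar -xzvf model.tar.gz")  # yields model_algo-1
os.system("unzip model_algo-1")      # yields an MXNet module checkpoint prefixed 'mod'

# Load the module and read out the linear model's parameters
mod = mx.module.Module.load("mod", 0)
weights = mod._arg_params["fc0_weight"].asnumpy().flatten()  # parameter names assumed
bias = mod._arg_params["fc0_bias"].asnumpy().flatten()
print(weights.shape, bias)
```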