I've been reading in the litterature that quasi newton methods such as DFP and BFGS have poor performance on ill-conditionned problems, but I don't understand the reason of it. I have been trying to use these methods on a quadratic problem that is ill-conditionned and it doesn't converge in p+1 iterations (it is one of quasi newton methods properties for quadratic problems), but a little more. Why is that ? Thank you for your help.
Why do quasi newton methods such as DFP and BFGS have poor performance on ill-conditionned problem, even if quadratic
1k views Asked by Kathryn Schutte At
1
There are 1 answers
Related Questions in OPTIMIZATION
- Optimize LCP ReactJs
- Efficiently processing many small elements of a collection concurrently in Java
- How to convert the size of the HTML document from 68 Kb to the average of 33 Kb?
- Optimizing Memory-Bound Loop with Indirect Prefetching
- Google or-tools soft constraint issue
- How to find function G(x), and make for every x, G(x) always returns fixed point for another function F(G(x))
- Trying to sort a set of words with the information theory to solve Worlde in Python but my program is way to slow
- Do conditional checks cause bottlenecks in Javascript?
- Hourly and annual optimization problem over matrix
- Sending asynchronous requests without a pre-defined task list
- DBT - Using SELECT * in the staging layer
- Using `static` on a AVX2 counter function increases performance ~10x in MT environment without any change in Compiler optimizations
- Is this a GCC optimiser bug or a feature?
- Performance difference between two JavaScript code snippets for comparing arrays of strings
- Distribute a list of positive numbers into a desired number of sets, aiming to have sums as close as possible between them
Related Questions in DATA-SCIENCE
- KEDRO - How to specify an arbitrary binary file in catalog.yml?
- Struggling to set up a sparse matrix problem to complete data analysis
- How do I remove slashes and copy the values into many other rows in pandas?
- Downloading full records from Entrez
- Error While calling "from haystack.document_stores import ElasticsearchDocumentStore"
- How to plot time series from 2 columns (Date and Value) by Python google colab?
- How to separate Hijri (Arabic) and Gregorian date ranges from on column to separate columns
- How to wait the fully download of a file with selenium(firefox) in python
- Survey that collects anonymous results, but tracks which recipient have responded
- Dataframe isin function Buffer was wrong number of dimensions error
- How to add different colours in an Altair grouped bar chart in python?
- Python Sorting list of dictionaries with nested list
- Float Division by Zero Error with Function Telling Greatest Power of a Number Dividing Another Number
- If a row contains at least two not NaN values, split the row into two separate ones
- DATA_SOURCE_NOT_FOUND Failed to find data source: mlflow-experiment. Please find packages at `https://spark.apache.org/third-party-projects.html
Related Questions in QUADRATIC
- Linearlization of quadratic constraint
- Find the first three terms in a geometric series using the given sum and product
- How to convert vector of values to concave down parabolic curve
- regex_search returns an error when using user input
- I have stored the values of f(x) in an array, but how do i print the value of x of a certain element f(x) from the array in C?
- How to use plot_model to represent the coefficient of an interaction between a quadratic variable and a dummy in R?
- Find the control point of a quadratic curve
- cvxpy.error.DCPError: Problem does not follow DCP rules: Error is probably due to the quardratic constraint , but how to solve this problem?
- Quadratic equation in Python
- Change Quadratic Objective Function in R Optimization (ROI Package)
- Score models based on log, quadratic, or linear fit on a 0-1 scale
- Getting Bezier Handles from a Curve's control points
- How do I ask the user if they want to solve the equation again with a yes or no answer in Python?
- Draw separate linear and quadratic regression graphs for each group in the same panel
- How to conceptually interpret output of a polynomial (quadratic) regression, when the regressors are orthogonal/correlated
Related Questions in HESSIAN
- Use pytorch to compute hessian of network, and the value is quite different from the theory
- How to preallocate using JacobianConfig for "Hessian of vector valued function" double Jacobian in Julia ForwardDiff package
- Hessian (Java) deserialization of class which is not in the classpath does not raise an error
- Java upgrade with Caucho Hessian v3/v4
- Add in Spring boot application a servlet that is not extending jakarta.servlet
- Errror: $ not suitable to atomic vectors
- Fast way to calculate Hessian matrix of model parameters in PyTorch
- A faster Hessian vector product in PyTorch
- how do i figure out what my exact objective gradient and hessian for a scipy optimize problem?
- @JsonProperty alternative in Spring Hessian
- Calculating the Hessian Vector Product of a Flax NN output wrt to the inputs
- How to use Optim.jl optimize function with a summation
- Compute Hessian of lossfunction in Tensorflow
- Computing the Hessian of a Simple NN in PyTorch wrt to Parameters
- Hessians for Spearman Rank Correlation
Popular Questions
- How do I undo the most recent local commits in Git?
- How can I remove a specific item from an array in JavaScript?
- How do I delete a Git branch locally and remotely?
- Find all files containing a specific text (string) on Linux?
- How do I revert a Git repository to a previous commit?
- How do I create an HTML button that acts like a link?
- How do I check out a remote Git branch?
- How do I force "git pull" to overwrite local files?
- How do I list all files of a directory?
- How to check whether a string contains a substring in JavaScript?
- How do I redirect to another webpage?
- How can I iterate over rows in a Pandas DataFrame?
- How do I convert a String to an int in Java?
- Does Python have a string 'contains' substring method?
- How do I check if a string contains a specific word?
Trending Questions
- UIImageView Frame Doesn't Reflect Constraints
- Is it possible to use adb commands to click on a view by finding its ID?
- How to create a new web character symbol recognizable by html/javascript?
- Why isn't my CSS3 animation smooth in Google Chrome (but very smooth on other browsers)?
- Heap Gives Page Fault
- Connect ffmpeg to Visual Studio 2008
- Both Object- and ValueAnimator jumps when Duration is set above API LvL 24
- How to avoid default initialization of objects in std::vector?
- second argument of the command line arguments in a format other than char** argv or char* argv[]
- How to improve efficiency of algorithm which generates next lexicographic permutation?
- Navigating to the another actvity app getting crash in android
- How to read the particular message format in android and store in sqlite database?
- Resetting inventory status after order is cancelled
- Efficiently compute powers of X in SSE/AVX
- Insert into an external database using ajax and php : POST 500 (Internal Server Error)
Ill-conditioning is a general problem for first-order optimization algorithms. Basically there are two main aspects with ill-conditioning:
Standard Newton method, which is a second-order optimization algorithm, in theory, can tackle both of these problems: first, it converges much faster, hence, there is a much lower roundoff error, and second, it does not have problems with the stretched shape since now it takes into account the gradient scaling to move in a proper direction.
However, solving it involves the operation of the inverse of the Hessian, which, in the case of large condition numbers, may result in the corresponding small eigenvalues blowing up, leading to numerical instability when implemented in a naive way. This issue can still be resolved by utilizing a proper direct solver or iterative solver with pre-conditioning
Finally, quasi-Newton methods such as DFP and BFGS try to approximate the inverse Hessian iteratively. They may be more robust in handling roundoff errors and also may be faster in convergence than first-order methods, but because of this approximation, they may end up performing worse than the standard Newton method.