Hacker News
halflings on April 4, 2017 | on: Why Momentum Works
I assume that multiplying the gradient by a constant factor shouldn't matter, since the learning rate already multiplies the gradient. It might just mean that the learning rate needs to be set lower or higher with this method.
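
A minimal sketch of that argument, assuming the factor in question is a constant such as (1 - beta) applied to the gradient inside the momentum update (the thread does not spell this out). The toy objective, the function names, and the hyperparameter values are all hypothetical and chosen only for illustration:

```python
def grad(w):
    # Gradient of the toy objective f(w) = 0.5 * w**2 (assumed for illustration).
    return w


def run_momentum(lr, beta, grad_scale, steps=50, w0=5.0):
    """Gradient descent with momentum; the gradient is multiplied by
    grad_scale before it is accumulated into the velocity."""
    w, v = w0, 0.0
    trajectory = []
    for _ in range(steps):
        v = beta * v + grad_scale * grad(w)
        w = w - lr * v
        trajectory.append(w)
    return trajectory


beta = 0.9
lr = 0.01

# Unscaled gradient with learning rate lr.
plain = run_momentum(lr=lr, beta=beta, grad_scale=1.0)

# Gradient scaled by (1 - beta), compensated by dividing lr by (1 - beta).
scaled = run_momentum(lr=lr / (1 - beta), beta=beta, grad_scale=1 - beta)

# Largest difference between the two trajectories (floating-point noise only).
max_gap = max(abs(a - b) for a, b in zip(plain, scaled))
print(max_gap)
```

With the velocity initialized to zero, the scaled velocity is just the unscaled one multiplied by the same constant, so the rescaled learning rate cancels it and both runs follow the same iterates up to rounding.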
im3w1l on April 5, 2017
The question is then really about which method makes it easier to tune parameters or which helps intuition the most.
gabrielgoh on April 4, 2017
This is a good way to think about it.