Gradient Descent implementation in Octave

Question

I've actually been struggling with this for about 2 months. What is it that makes these two versions different?

hypotheses= X * theta
temp=(hypotheses-y)'
temp=X(:,1) * temp
temp=temp * (1 / m)
temp=temp * alpha
theta(1)=theta(1)-temp

hypotheses= X * theta
temp=(hypotheses-y)'
temp=temp * (1 / m)
temp=temp * alpha
theta(2)=theta(2)-temp



theta(1) = theta(1) - alpha * (1/m) * ((X * theta) - y)' * X(:, 1);
theta(2) = theta(2) - alpha * (1/m) * ((X * theta) - y)' * X(:, 2);

The latter works. I'm just not sure why. I struggle to understand the need for the matrix inverse.

32

octave

Author: narthur157, 2012-05-15

5 answers

In the first version, if X were a 3x2 matrix and theta a 2x1 matrix, then "hypotheses" would be a 3x1 matrix.

Assuming y is a 3x1 matrix, you can compute (hypotheses - y) and get a 3x1 matrix; the transpose of that 3x1 matrix is a 1x3 matrix, which is assigned to temp.

In your second block, that 1x3 matrix (scaled by 1/m and alpha) is then subtracted from theta(2), but theta(2) must be a scalar, not a matrix.

The last two lines of your code work because, using my m x n examples above,

(X * theta)

would be a 3x1 matrix.

Then y (a 3x1 matrix) is subtracted from that 3x1 matrix, and the result is a 3x1 matrix.

(X * theta) - y

So the transpose of that 3x1 matrix is a 1x3 matrix.

((X * theta) - y)'

Finally, a 1x3 matrix times a 3x1 matrix equals a scalar (a 1x1 matrix), which is what you are looking for. I'm sure you already knew this, but just to be thorough: X(:, 2) is the second column of the 3x2 matrix, so it is a 3x1 matrix.
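The dimension bookkeeping above can be checked directly with size(). This is only a small sketch: the 3x2 matrix X, the targets y, and the variable names residual, row, grad2 and outer are made up for illustration and do not come from the question.

```octave
% Made-up 3x2 design matrix (first column of ones) and 3x1 targets.
X = [1 1; 1 2; 1 3];
y = [1; 2; 3];
theta = [0; 0];

residual = (X * theta) - y;   % 3x1
row = residual';              % 1x3
grad2 = row * X(:, 2);        % 1x3 times 3x1 gives 1x1 (a scalar)
outer = X(:, 2) * row;        % 3x1 times 1x3 gives 3x3, not a scalar

disp(size(residual));
disp(size(row));
disp(size(grad2));
disp(size(outer));
```

The order of the product matters: only the row vector times the column vector collapses to the single number that can be subtracted from theta(2).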

5

Author: Justin Nafe, 2012-05-16 22:47:40

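When you update, you need to compute both new values from the old theta before assigning either one (a simultaneous update), like this:

Start Loop {
    % compute both updates from the current theta0 and theta1
    temp0 = theta0 - (equation_here);
    temp1 = theta1 - (equation_here);

    % only then overwrite the parameters
    theta0 = temp0;
    theta1 = temp1;
} End loop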
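A minimal sketch of why the temporaries matter, using made-up data (y = 2*x) and a made-up alpha; the names theta_simultaneous and theta_sequential are hypothetical. The sequential variant mirrors the two-line version from the question, where theta(2) is computed after theta(1) has already been overwritten.

```octave
% Made-up data: y = 2*x exactly, with a column of ones for the intercept.
X = [1 1; 1 2; 1 3];
y = [2; 4; 6];
m = length(y);
alpha = 0.1;

% Simultaneous update: both gradients use the same old theta.
theta = [0; 0];
temp0 = theta(1) - alpha * (1/m) * ((X * theta) - y)' * X(:, 1);
temp1 = theta(2) - alpha * (1/m) * ((X * theta) - y)' * X(:, 2);
theta_simultaneous = [temp0; temp1];

% Sequential update: theta(2) is computed after theta(1) has changed.
theta = [0; 0];
theta(1) = theta(1) - alpha * (1/m) * ((X * theta) - y)' * X(:, 1);
theta(2) = theta(2) - alpha * (1/m) * ((X * theta) - y)' * X(:, 2);
theta_sequential = theta;

disp(theta_simultaneous');
disp(theta_sequential');
```

After a single step the two variants already disagree in the second component, so the sequential form is not the same algorithm, even if it may still converge on simple problems.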

3

Author: hbr, 2013-10-22 20:42:20

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
% Performs gradient descent to learn theta. Updates theta by taking num_iters 
% gradient steps with learning rate alpha.

% Number of training examples
m = length(y); 
% Save the cost J in every iteration in order to plot J vs. num_iters and check for convergence 
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    h = X * theta;
    err = h - y;   % renamed from stderr, which shadows Octave's built-in stderr
    theta = theta - (alpha/m) * (err' * X)';
    J_history(iter) = computeCost(X, y, theta);
end

end
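The function above calls computeCost, which is not shown anywhere in this thread. A plausible stand-in, assuming the usual squared-error cost for linear regression (an assumption, not something the answer states), might look like the sketch below. The leading 1; only keeps Octave from treating the file as a function file when the snippet is run as a script.

```octave
1;  % mark this file as a script so the function can be defined inline

% Hypothetical stand-in for computeCost: half the mean squared error.
function J = computeCost(X, y, theta)
    m = length(y);               % number of training examples
    residual = X * theta - y;    % prediction errors, m x 1
    J = (residual' * residual) / (2 * m);
end

% Made-up data for a quick check.
X = [1 1; 1 2; 1 3];
y = [1; 2; 3];
J0 = computeCost(X, y, [0; 0]);  % cost at the starting point
J1 = computeCost(X, y, [0; 1]);  % cost at a perfect fit (y = x)
disp([J0, J1]);
```

With a cost like this, J_history should decrease from one iteration to the next when alpha is small enough.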

2

Author: skeller88, 2017-10-18 21:45:37

Spoiler alert

m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
%               theta. 
%
% Hint: While debugging, it can be useful to print out the values
%       of the cost function (computeCost) and gradient here.
% ========================== BEGIN ===========================


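J = computeCost(X, y, theta);                        % cost before the step
theta = theta - ((alpha*((theta'*X') - y'))*X/m)';   % vectorized gradient step
J1 = computeCost(X, y, theta);                       % cost after the step

if J1 > J
    fprintf('Wrong alpha\n');   % report before leaving the loop
    break;
elseif J1 == J
    break;                      % cost stopped changing
end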


% ========================== END ==============================

% Save the cost J in every iteration    
J_history(iter) = sum(computeCost(X, y, theta));
end
end

-8

Author: user2696258, 2015-08-12 02:36:36

score 59 · Accepted Answer

In your first example, the second block is missing a step, isn't it? I am assuming you concatenated X with a vector of ones.

   temp = temp * X(:,2)   % 1xm times mx1 gives a scalar; X(:,2) * temp would be an mxm outer product

The last example will work, but it can be vectorized further to be simpler and more efficient.

I've assumed you only have 1 feature. It works the same way with multiple features, since all that happens is that you add an extra column to your X matrix for each feature. Basically, you add a vector of ones to x to vectorize the intercept.

You can update a 2x1 matrix of thetas in one line of code. Concatenate a vector of ones with x to make it an n x 2 matrix; then you can calculate h(x) by multiplying by the theta vector (2x1). This is the (X * theta) bit.

The second part of the vectorization is to transpose ((X * theta) - y), which gives you a 1 x n matrix. When multiplied by X (an n x 2 matrix), it aggregates both (h(x) - y)x0 and (h(x) - y)x1, so both thetas are updated at the same time by construction. This results in a 1 x 2 matrix, which I transpose again so that it has the same dimensions as the theta vector. I can then do a simple scalar multiplication by alpha and a vector subtraction from theta.
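Written out, the description above corresponds to the following update. The per-component form assumes the usual squared-error cost for linear regression, which this answer does not state explicitly; x_0 is the column of ones.

```latex
% Per-component update, for j = 0, 1 (both computed from the same old \theta):
\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{n} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)},
\qquad h_\theta(x^{(i)}) = \theta_0 x_0^{(i)} + \theta_1 x_1^{(i)}

% Same update in matrix form, with X of size n \times 2 and m = n training examples:
\theta := \theta - \frac{\alpha}{m} \left( (X\theta - y)^{\top} X \right)^{\top}
        = \theta - \frac{\alpha}{m} X^{\top} (X\theta - y)
```

The 1 x 2 product (Xθ - y)ᵀX holds the two sums from the per-component form side by side, which is why a single line updates both thetas at once.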

X = data(:, 1); y = data(:, 2);
m = length(y);
X = [ones(m, 1), data(:,1)]; 
theta = zeros(2, 1);        

iterations = 2000;
alpha = 0.001;

for iter = 1:iterations
     theta = theta -((1/m) * ((X * theta) - y)' * X)' * alpha;
end
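To sanity-check the loop without the original data file, one option is to run it on a tiny synthetic data set. Everything below is an assumption for illustration: the points come from y = 1 + 2*x, alpha = 0.1 is chosen to suit this toy data rather than as a general recommendation, and theta_exact is a hypothetical name for the least-squares solution used as a reference.

```octave
% Made-up data generated from y = 1 + 2*x (no noise), standing in for `data`.
x = [1; 2; 3];
y = 1 + 2 * x;
m = length(y);
X = [ones(m, 1), x];   % prepend the column of ones for the intercept
theta = zeros(2, 1);

iterations = 2000;
alpha = 0.1;           % suits this toy data set only

for iter = 1:iterations
    theta = theta - alpha * (1/m) * (((X * theta) - y)' * X)';
end

theta_exact = X \ y;   % least-squares solution for comparison
disp([theta, theta_exact]);
```

If the two columns printed at the end agree closely, the vectorized step is behaving as expected on this data.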