00 Linear Algebra Preliminaries#

A minimal summary of linear algebra for the course.

Version v.3#

This notebook may still be modified; this version is v.1 as of 06/11/2024, 9 am Caracas time.

Author: Fernando Crema García
Contact: fernando.cremagarcia@kuleuven.be; fernando.cremagarcia@esat.kuleuven.be

Machine learning#

Main goal (for now) of ML#

The simplest case is to assume that we want to predict a value based on knowledge of the past.

The goal is to find a function, denoted $f$, that maps the input data $X$ to the output labels $Y$.

\[ y = f(x) \text{ for } x \in X,\ y \in Y \]

or equivalently

\[ f: X \rightarrow Y \]

Where:

$X$ represents the input data.
$Y$ are the output labels or desired responses.
$f(\cdot)$ is the function that the supervised learning model seeks to “learn”.

3. Vectors#

By $x \in \mathbb{R}^{n}$ we denote a vector with $n$ entries. In this course, we assume that a vector is a column vector.

\[\begin{split} x=\left[\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array}\right] \end{split}\]

3.1 Example with numpy#

a = [1, 2, 3]

type(a)

list

import numpy as np # scipy, Dataframes -> Pandas, Polars

# Vector definition
v = np.array([1, 2, 3])

?np.array

Names (objects)

v2 = np.array(a)
v2

array([1, 2, 3])

v.shape

(3,)

v.reshape((1, 3))

array([[1, 2, 3]])

v2 = np.array([[1], [2], [3]])

v2

array([[1],
       [2],
       [3]])

v2.shape

(3, 1)

# Dot product of vectors
dot_product = np.dot(v, v2)
print("Dot Product:", dot_product)

Dot Product: [14]

3.2 Basic operations#

3.2.1 Addition#

The sum of two vectors $x$ and $y$ is defined as:

\[\begin{split} x + y = \left[\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array}\right] + \left[\begin{array}{c} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{array}\right] = \left[\begin{array}{c} x_{1} + y_{1} \\ x_{2} + y_{2} \\ \vdots \\ x_{n} + y_{n} \end{array}\right] \end{split}\]

Where $x_i$ and $y_i$ are the respective entries of the vectors.

import numpy as np

# Define two vectors
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

# Sum: stack the two vectors and add along axis 0 (entry by entry)
suma = np.sum([v1, v2], axis=0)

print("Suma/:", suma)

Suma/: [5 7 9]

from numpy import array
from numpy import sum as suma_ml  # import the function under an alias
# Define two vectors
v1 = array([1, 2, 3])
v2 = array([4, 5, 6])

# Sum, this time through the aliased function
suma = suma_ml([v1, v2], axis=0)

print("Suma/:", suma)

Suma/: [5 7 9]

suma = np.sum([v1, v2], axis=1)  # axis=1 adds up the entries of each vector separately
suma

array([ 6, 15])

suma.shape

(2,)

?np.sum

v = np.array([[4], [5], [6]])
v2 = np.array([[1], [2], [3]])

# Sum
suma = np.sum([v, v2], axis=0)

print("Suma/:", suma)

Suma/: [[5]
 [7]
 [9]]

suma.shape

(3, 1)

3.2.2 Operator overloading#

Operator overloading is a programming concept that lets us define the behavior of operators such as +, -, *, / and others for custom data types.

1.0 + 2

3.0

'a' + 'b'

'ab'

import numpy as np

# Define two vectors
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

# Sum
suma = v1 + v2 # element-wise; same result as np.sum([v1, v2], axis=0)

print("Suma/:", suma)

Suma/: [5 7 9]

3.2.3 Subtraction#

The difference of two vectors $x$ and $y$ is defined as:

\[\begin{split} x - y = \left[\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array}\right] - \left[\begin{array}{c} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{array}\right] = \left[\begin{array}{c} x_{1} - y_{1} \\ x_{2} - y_{2} \\ \vdots \\ x_{n} - y_{n} \end{array}\right] \end{split}\]

Where $x_i$ and $y_i$ are the respective entries of the vectors.

import numpy as np

# Define two vectors
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

v1 - v2

array([-3, -3, -3])

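How would we do this with `np.sum`? Summing the vectors as they are gives the sum, not the difference:

import numpy as np

# Define two vectors
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

# This is still a sum, so the result is not the difference yet
resta = np.sum([v1, v2], axis=0)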

print("Resta/:", resta)

Resta/: [5 7 9]

3.2.4 Scalar multiplication#

Scalar multiplication involves a number $a \in \mathbb{R}$ and a vector $x$, and is defined as:

\[\begin{split} a \cdot x = a \cdot \left[\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array}\right] = \left[\begin{array}{c} a \cdot x_{1} \\ a \cdot x_{2} \\ \vdots \\ a \cdot x_{n} \end{array}\right] \end{split}\]

Where $x_i$ are the respective entries of the vector $x$.

# Scalar multiplication
scalar = 2
v1 = np.array([1, 2, 3])
scaled_vector = scalar * v1
print("Scalar Multiplication:", scaled_vector)

Scalar Multiplication: [2 4 6]

3.2.5 Back to subtraction#

import numpy as np

# Define two vectors
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

# Subtraction as a sum: add v1 and the negated v2 (scalar multiplication by -1)
suma = np.sum([v1, -v2], axis=0)

print("Suma/:", suma)

Suma/: [-3 -3 -3]

3.2.6 [Extra]: Converting from (3, 1) to (3,) with flatten#

v = np.array([[4], [5], [6]])
v2 = np.array([[1], [2], [3]])

# Sum of the flattened (1-D) versions of both column vectors
suma = np.sum([v.flatten(), v2.flatten()], axis=0)

print("Suma/:", suma)

Suma/: [5 7 9]

3.3 Properties of vector addition and subtraction#

Commutativity:
- Addition: $$\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$$
- Subtraction: in general, $$ \mathbf{x} - \mathbf{y} \neq \mathbf{y} - \mathbf{x} $$
Associativity:
- Addition: $$ (\mathbf{x} + \mathbf{y}) + \mathbf{w} = \mathbf{x} + (\mathbf{y} + \mathbf{w}) $$
- Subtraction: in general, $$ (\mathbf{x} - \mathbf{y}) - \mathbf{w} \neq \mathbf{x} - (\mathbf{y} - \mathbf{w}) $$
Neutral element:
- Addition: there exists a vector $$ \mathbf{0} $$ such that $$ \mathbf{x} + \mathbf{0} = \mathbf{x} $$
- Subtraction: there is no two-sided neutral element; $$ \mathbf{x} - \mathbf{0} = \mathbf{x} $$, but $$ \mathbf{0} - \mathbf{x} = -\mathbf{x} $$.
Additive inverse:
- Addition: for every vector $$ \mathbf{x} $$ there exists a vector $$ -\mathbf{x} $$ such that $$ \mathbf{x} + (-\mathbf{x}) = \mathbf{0} $$
- Subtraction: no additive inverse is defined.
Distributivity (with a scalar $a \in \mathbb{R}$):
- Scalar multiplication over addition: $$ a (\mathbf{x} + \mathbf{y}) = a \mathbf{x} + a \mathbf{y} $$
- Scalar multiplication over subtraction: $$ a (\mathbf{x} - \mathbf{y}) = a \mathbf{x} - a \mathbf{y} $$
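The properties listed above can be checked numerically. The following is a minimal sketch with arbitrary illustrative values (the vectors, the scalar `a` and the variable names are assumptions, not part of the original notebook); distributivity is checked for a scalar multiplying a sum of vectors.

```python
import numpy as np

# Illustrative vectors and scalar (arbitrary values chosen for this sketch)
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
w = np.array([-1.0, 0.5, 2.0])
a = 3.0
zero = np.zeros(3)

commutative = np.array_equal(x + y, y + x)
associative = np.allclose((x + y) + w, x + (y + w))
neutral = np.array_equal(x + zero, x)
inverse = np.array_equal(x + (-x), zero)
distributive = np.allclose(a * (x + y), a * x + a * y)

# Subtraction does not commute for these vectors
subtraction_commutes = np.array_equal(x - y, y - x)

print(commutative, associative, neutral, inverse, distributive)
print(subtraction_commutes)
```

A numerical check like this only illustrates the properties for particular values; it is not a proof.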

3.4 Multiplication (Hadamard product)#

We define the Hadamard product of two vectors $x$ and $y$ as:

\[\begin{split} x \odot y = \left[\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array}\right] \odot \left[\begin{array}{c} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{array}\right] = \left[\begin{array}{c} x_{1} . y_{1} \\ x_{2} . y_{2} \\ \vdots \\ x_{n} . y_{n} \end{array}\right] \end{split}\]

Where:

$x$ is a column vector of dimensions $n \times 1$.
$y$ is a column vector of dimensions $n \times 1$.

import numpy as np

# Define two vectors
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

# Hadamard product (we can also use *)
hadamard_product = np.multiply(v1, v2)

print("Producto de Hadamard:", hadamard_product)

Producto de Hadamard: [ 4 10 18]

v1*v2

array([ 4, 10, 18])

4. Matrices#

By $A \in \mathbb{R}^{m \times n}$ we denote a matrix with $m$ rows and $n$ columns, where the entries of $A$ are real numbers.

\[\begin{split} A=\left[\begin{array}{cccc} a_{11} & a_{12} & \cdots & a_{1 n} \\ a_{21} & a_{22} & \cdots & a_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m 1} & a_{m 2} & \cdots & a_{m n} \end{array}\right]=\left[\begin{array}{cccc} \mid & \mid & & \mid \\ a^{1} & a^{2} & \cdots & a^{n} \\ \mid & \mid & & \mid \end{array}\right]=\left[\begin{array}{ccc} - & a_{1}^{T} & - \\ - & a_{2}^{T} & - \\ & \vdots \\ - & a_{m}^{T} & - \end{array}\right] \end{split}\]

A = np.matrix('1 2; 3 4')
A

matrix([[1, 2],
        [3, 4]])

A = np.matrix(
    [
        [1, 2],
        [3, 4]
    ]
)
A

matrix([[1, 2],
        [3, 4]])

4.1 Identity matrix $I$#

The identity matrix, denoted $I \in \mathbb{R}^{n \times n}$, is a square matrix with ones on the diagonal and zeros everywhere else. That is,

\[\begin{split} I_{i j}= \begin{cases}1 & i=j \\ 0 & i \neq j\end{cases} \end{split}\]

It has the property that for every $A \in \mathbb{R}^{m \times n}$,

\[ A I_{n} = A = I_{m} A , \]

where the subscript indicates the size of each identity matrix.

import numpy as np

A = np.eye(3)
A

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
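As a quick check of the property $A I = A = I A$, the sketch below multiplies an illustrative non-square matrix by identities of the matching sizes. The matrix `M` and the names `right` and `left` are assumptions made for this example.

```python
import numpy as np

# Illustrative 2x3 matrix (m = 2, n = 3)
M = np.array([[1, 2, 3],
              [4, 5, 6]])

right = M @ np.eye(3)  # identity of size n = 3 on the right
left = np.eye(2) @ M   # identity of size m = 2 on the left

print(np.array_equal(right, M), np.array_equal(left, M))
```

Note that the two identity matrices have different sizes when `M` is not square.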

4.2 Diagonal matrices#

A diagonal matrix is a matrix in which all off-diagonal elements are 0. It is usually denoted $$D=\operatorname{diag}\left(d_{1}, d_{2}, \ldots, d_{n}\right)$$, with

\[\begin{split} D_{i j}= \begin{cases}d_{i} & i=j \\ 0 & i \neq j\end{cases} \end{split}\]

Clearly, $$I=\operatorname{diag}(1,1, \ldots, 1)$$.

A = np.matrix('1 2 3; 3 4 5')
np.diag(A)

array([1, 4])

A = np.matrix('1 2 3; 3 4 5; 6 7 8')
np.diag(A)

array([1, 4, 8])
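The cells above use `np.diag` to extract the diagonal of a matrix. The same function also builds a diagonal matrix when it receives a 1-D array, which matches the notation $D=\operatorname{diag}(d_1, \ldots, d_n)$. A minimal sketch, with `d` and `D` as illustrative names:

```python
import numpy as np

d = np.array([1, 2, 3])

D = np.diag(d)            # 1-D input: build the diagonal matrix diag(1, 2, 3)
print(D)

recovered = np.diag(D)    # 2-D input: extract the diagonal again
print(recovered)

# The identity matrix is the diagonal matrix of ones
identity_is_diag = np.array_equal(np.diag(np.ones(3)), np.eye(3))
print(identity_is_diag)
```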

4.3 The transpose of a matrix $A^{T}$#

The transpose of a matrix results from “flipping” its rows and columns. Given a matrix $A \in \mathbb{R}^{m \times n}$, its transpose, denoted $A^{T} \in \mathbb{R}^{n \times m}$, is the $n \times m$ matrix whose entries are given by

\[ \left(A^{T}\right)_{i j}=A_{j i} \]

The following properties of transposes are easy to verify:

$\left(A^{T}\right)^{T}=A$
$(A B)^{T}=B^{T} A^{T}$
$(A+B)^{T}=A^{T}+B^{T}$

A = np.matrix('1 2 3; 3 4 5; 6 7 8')
A

matrix([[1, 2, 3],
        [3, 4, 5],
        [6, 7, 8]])

np.transpose(A)

matrix([[1, 3, 6],
        [2, 4, 7],
        [3, 5, 8]])

A.T

matrix([[1, 3, 6],
        [2, 4, 7],
        [3, 5, 8]])
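The three transpose properties listed in this section can be checked on small integer matrices. This is a sketch under stated assumptions: the matrices `P`, `Q`, `R` are randomly generated for illustration and are not the notebook's `A`.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.integers(-5, 5, size=(2, 3))
Q = rng.integers(-5, 5, size=(3, 4))
R = rng.integers(-5, 5, size=(2, 3))  # same shape as P, so P + R is defined

double_transpose = np.array_equal(P.T.T, P)          # (A^T)^T = A
product_rule = np.array_equal((P @ Q).T, Q.T @ P.T)  # (AB)^T = B^T A^T
sum_rule = np.array_equal((P + R).T, P.T + R.T)      # (A+B)^T = A^T + B^T

print(double_transpose, product_rule, sum_rule)
```

The order of the factors is reversed in the product rule; `P.T @ Q.T` would not even have compatible shapes here.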

?np.matrix

4.4 Trace of a matrix#

The trace of a square matrix $A \in \mathbb{R}^{n \times n}$, denoted $\operatorname{tr} A$, is the sum of the diagonal elements of the matrix:

\[ \operatorname{tr} A=\sum_{i=1}^{n} A_{i i} \]

4.4.1 Properties#

For $A \in \mathbb{R}^{n \times n}, \operatorname{tr} A=\operatorname{tr} A^{T}$.
For $A, B \in \mathbb{R}^{n \times n}, \operatorname{tr}(A+B)=\operatorname{tr} A+\operatorname{tr} B$.
For $A \in \mathbb{R}^{n \times n}, t \in \mathbb{R}, \operatorname{tr}(t A)=t \operatorname{tr} A$.
For $A, B$ such that $A B$ is square, $\operatorname{tr} A B=\operatorname{tr} B A$.
For $A, B, C$ such that $A B C$ is square, $\operatorname{tr} A B C=\operatorname{tr} B C A=\operatorname{tr} C A B$, and so on for the product of more matrices.
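The trace properties above can be illustrated with `np.trace`. The sketch below uses randomly generated integer matrices (`P`, `Q`, `U`, `V` and the scalar `t` are illustrative assumptions); `U` and `V` are deliberately non-square so that both `U @ V` and `V @ U` are square but of different sizes.

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.integers(-5, 5, size=(3, 3))
Q = rng.integers(-5, 5, size=(3, 3))
U = rng.integers(-5, 5, size=(2, 4))
V = rng.integers(-5, 5, size=(4, 2))
t = 2.5

checks = {
    "tr A = tr A^T": np.trace(P) == np.trace(P.T),
    "tr(A + B) = tr A + tr B": np.trace(P + Q) == np.trace(P) + np.trace(Q),
    "tr(tA) = t tr A": np.isclose(np.trace(t * P), t * np.trace(P)),
    "tr(AB) = tr(BA)": np.trace(U @ V) == np.trace(V @ U),
}

for name, ok in checks.items():
    print(name, ok)
```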

5 Functions on vectors#

In general, we need tools that let us measure properties of vectors and matrices. We will use different ones depending on the context, but the important thing is to understand, in simple terms, what they are. Usually they are just functions whose input is a vector or a matrix and whose output is a number, although there are also functions whose output is not a number. Whichever one we use, we will always try to make clear the motivation, the dimensions, and how it is used.

5.1 Vector norms#

A norm of a vector, $\|x\|$, is informally a measure of the “length” of the vector.

More formally, a norm is any function $f: \mathbb{R}^{n} \rightarrow \mathbb{R}$ that satisfies 4 properties:

(non-negativity) $$\forall x \in \mathbb{R}^{n}, f(x) \geq 0$$
(definiteness) $$f(x)=0 \iff x=0$$
(homogeneity) $$\forall x \in \mathbb{R}^{n}, t \in \mathbb{R}, f(t x)=|t| f(x)$$
(triangle inequality) $$\forall x, y \in \mathbb{R}^{n}, f(x+y) \leq f(x)+f(y)$$
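To make the four properties concrete, the sketch below checks them for the Euclidean norm on a few illustrative vectors. The helper `l2` and the values of `x`, `y` and `t` are assumptions made for this example.

```python
import numpy as np

def l2(v):
    """Euclidean norm of a 1-D array (illustrative helper)."""
    return np.sqrt(np.sum(v ** 2))

x = np.array([3.0, -4.0])
y = np.array([1.0, 2.0])
t = -2.0

non_negative = l2(x) >= 0
definite = l2(np.zeros(2)) == 0
homogeneous = np.isclose(l2(t * x), abs(t) * l2(x))
triangle = l2(x + y) <= l2(x) + l2(y)

print(l2(x))
print(non_negative, definite, homogeneous, triangle)
```

As with any numerical check, this illustrates the properties for specific vectors rather than proving them.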

5.1.1 Examples of norms#

The commonly used Euclidean or $\ell_{2}$ norm,

\[ \|x\|_{2}=\sqrt{\sum_{i=1}^{n} x_{i}^{2}} \]

from numpy import linalg as LA
a = np.arange(9) - 4
a

array([-4, -3, -2, -1,  0,  1,  2,  3,  4])

a.ravel()

array([-4, -3, -2, -1,  0,  1,  2,  3,  4])

Exercise: our own 2-norm

Other examples

np.sqrt(np.sum(a**2))

7.745966692414834

5.1.2 Numpy#

LA.norm(a)

7.745966692414834

b = a.reshape((3, 3))
b

array([[-4, -3, -2],
       [-1,  0,  1],
       [ 2,  3,  4]])

LA.norm(a)

7.745966692414834

LA.norm(b)

7.745966692414834

The $\ell_{1}$ norm,

\[ \|x\|_{1}=\sum_{i=1}^{n}\left|x_{i}\right| \]

LA.norm(b.flatten(), ord=1)

20.0

LA.norm(b, ord=1)

7.0

?LA.norm

The $\ell_{\infty}$ norm,

\[ \|x\|_{\infty}=\max _{i}\left|x_{i}\right| . \]
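The $\ell_{\infty}$ norm has no code cell of its own, so here is a minimal sketch that computes it by hand and with `numpy.linalg.norm` using `ord=np.inf`. It reuses the vector `np.arange(9) - 4` from the earlier cells; the names `inf_manual` and `inf_numpy` are illustrative.

```python
import numpy as np
from numpy import linalg as LA

a = np.arange(9) - 4

inf_manual = np.max(np.abs(a))      # largest absolute entry
inf_numpy = LA.norm(a, ord=np.inf)  # same value through numpy

print(inf_manual, inf_numpy)
```

For a 2-D input, `ord=np.inf` computes a matrix norm instead (the maximum row sum of absolute values), so flatten the array first if the vector norm is what you want.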

5.2 Generalization of norms (important later)#

In fact, all three norms presented so far are examples of the family of $\ell_{p}$ norms, which are parameterized by a real number $p \geq 1$, and defined as

\[ \|x\|_{p}=\left(\sum_{i=1}^{n}\left|x_{i}\right|^{p}\right)^{1 / p} \]
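The $\ell_{p}$ formula can be written directly in numpy and compared against `numpy.linalg.norm`. This is a sketch: `lp_norm` is a hypothetical helper written for this example, valid for 1-D arrays and real $p \geq 1$.

```python
import numpy as np
from numpy import linalg as LA

def lp_norm(v, p):
    """l_p norm of a 1-D array for a real p >= 1 (hypothetical helper)."""
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

a = np.arange(9) - 4

for p in (1, 2, 3):
    # The hand-written formula and numpy should agree for each p
    print(p, lp_norm(a, p), LA.norm(a, ord=p))
```

With `p=1` and `p=2` this reproduces the $\ell_1$ and $\ell_2$ norms shown earlier in the section.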

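The 0 “norm”, defined as the “number of nonzero components of a vector”. Despite its name, it is not a true norm: it fails homogeneity, because scaling a vector by $t \neq 0$ does not change the count.

a = np.array([-1, 0, 9, 0, 0]) # The 0-norm is 2 (two nonzero entries). Selection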

6 Functions on matrices#

6.1 The Frobenius norm#

Norms can also be defined for matrices, such as the Frobenius norm,

\[ \|A\|_{F}=\sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} A_{i j}^{2}}=\sqrt{\operatorname {tr}\left(A^{T} A\right)} . \]

Many other norms exist, but they are beyond the scope of this review.
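Both expressions in the Frobenius norm formula can be compared with numpy's built-in version. A minimal sketch, where `M` is the same $3 \times 3$ array used as `b` in the norm examples and the other names are illustrative:

```python
import numpy as np
from numpy import linalg as LA

M = np.arange(9).reshape((3, 3)) - 4

fro_sum = np.sqrt(np.sum(M ** 2))        # square root of the sum of squared entries
fro_trace = np.sqrt(np.trace(M.T @ M))   # square root of tr(A^T A)
fro_numpy = LA.norm(M, ord="fro")        # numpy's Frobenius norm

print(fro_sum, fro_trace, fro_numpy)
```

This is also why `LA.norm(b)` returned the same value as `LA.norm(a)` earlier: for a 2-D array the default is the Frobenius norm.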

7 Operations between vectors#

7.1 Vector-Vector products#

7.1.1 Inner product or dot product#

\[\begin{split} x^{T} y \in \mathbb{R}=\left[\begin{array}{llll} x_{1} & x_{2} & \cdots & x_{n} \end{array}\right]\left[\begin{array}{c} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{array}\right]=\sum_{i=1}^{n} x_{i} y_{i} \text{ with } x, y \in \mathbb{R}^{n} \end{split}\]

import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([4, 3, 2, 1])
np.dot(x, y)

x * y

array([4, 6, 6, 4])

x @ y

np.sum(x * y)

np.dot(x, y)

7.1.2 Outer product#

\[\begin{split} x y^{T} \in \mathbb{R}^{m \times n}=\left[\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{m} \end{array}\right]\left[\begin{array}{llll} y_{1} & y_{2} & \cdots & y_{n} \end{array}\right]=\left[\begin{array}{cccc} x_{1} y_{1} & x_{1} y_{2} & \cdots & x_{1} y_{n} \\ x_{2} y_{1} & x_{2} y_{2} & \cdots & x_{2} y_{n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m} y_{1} & x_{m} y_{2} & \cdots & x_{m} y_{n} \end{array}\right] \end{split}\]

x = np.array([1, 2, 3, 4])
y = np.ones(6)

Understanding the dimensions

# Dimensions of the vectors
x.shape

(4,)

y.shape

(6,)

np.outer(x, y)

array([[1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4.]])

np.outer(x.reshape(4, 1), y.reshape(6, 1).T)

array([[1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4.]])

np.outer(x.reshape(1, 4), y.reshape(6, 1))

array([[1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4.]])

np.outer(x.reshape(4, 1), y.reshape(6, 1))

array([[1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4.]])

What does “flattened” mean?

Taken from Outer product numpy

x.reshape((2, 2)).flatten()

array([1, 2, 3, 4])

8 Operations between matrices and vectors#

8.1 Matrix-Vector product (by rows)#

If we write $A$ by rows, then we can express $A x$ as,

\[\begin{split} y=A x=\left[\begin{array}{ccc} - & a_{1}^{T} & - \\ - & a_{2}^{T} & - \\ & \vdots & \\ - & a_{m}^{T} & - \end{array}\right] * x= \left[\begin{array}{ccc} - & a_{1}^{T} & - \\ - & a_{2}^{T} & - \\ & \vdots & \\ - & a_{m}^{T} & - \end{array}\right] \left[\begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \end{array}\right]= \left[\begin{array}{c} a_{1}^{T} x \\ a_{2}^{T} x \\ \vdots \\ a_{m}^{T} x \end{array}\right] \end{split}\]

# Random square matrix
A = np.random.randint(-5, 5, (3, 3)) # Generate a square matrix of integers in the range [-5, 5) (upper bound excluded)
A

array([[ 2, -1,  4],
       [ 1,  4, -4],
       [ 3,  1, -1]])

x = np.ones(3)

A.dot(x)

array([5., 1., 3.])

A @ x

array([5., 1., 3.])

Slicing in numpy#

Row 1

A[0, :] # Get row 0 and all the columns of A

array([ 2, -1,  4])

Column 1

A[:, 0]

array([2, 1, 3])

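Elements 1 and 2 of column 2?

A[0:1, 1] # Half-open range [lower limit, upper limit): 0:1 covers only row 0; rows 0 and 1 need 0:2

array([-1])

li = 0  # lower limit (included)
ls = 2  # upper limit (excluded)

A[li:ls, 1]  # general pattern: A[row_lower:row_upper, col_lower:col_upper]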

array([-1,  4])

If we want to write our own function

\[\begin{split} y=A x=\left[\begin{array}{ccc} - & a_{1}^{T} & - \\ - & a_{2}^{T} & - \\ & \vdots & \\ - & a_{m}^{T} & - \end{array}\right] x=\left[\begin{array}{c} a_{1}^{T} x \\ a_{2}^{T} x \\ \vdots \\ a_{m}^{T} x \end{array}\right] \end{split}\]

def matProdX(A, x):
  result = []
  # Iterate from row 0 to row m-1 (here up to 2, because the matrix is 3x3)
  for i in range(A.shape[0]):
    # For each row i, store a_i.T * x
    result.append(
        A[i, :].dot(x)
    )
  # Return the result as a numpy array
  return np.array(result)

A.shape # Tuple (rows, columns); A.shape[0] is the number of rows

(3, 3)

matProdX(A, x)

array([5., 1., 3.])

8.2 Matrix-Vector product (by columns)#

If we write $A$ by columns, we have:

\[\begin{split} y=A x=\left[\begin{array}{cccc} \mid & \mid & & \mid \\ a^{1} & a^{2} & \ldots & a^{n} \\ \mid & \mid & & \mid \end{array}\right]\left[\begin{array}{c} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{array}\right]=\left[a^{1}\right] x_{1}+\left[a^{2}\right] x_{2}+\ldots+\left[a^{n}\right] x_{n} . \end{split}\]

Key idea: $y$ is a linear combination of the columns of $A$.

We will not develop this idea further for now, but it connects to solving the system $Ax = b$.

A

array([[ 2, -1,  4],
       [ 1,  4, -4],
       [ 3,  1, -1]])

arr1 = [1, 2, 3, 4]
arr2 = [8, 9, 10, 11]

for arr1_i, arr2_i in zip(arr1, arr2):
  print(arr1_i, arr2_i)

for ai in A:
  print(ai)

[ 2 -1  4]
[ 1  4 -4]
[ 3  1 -1]

result = np.zeros(3)
for ai, xi in zip(A, x):
  print(ai, xi)
  result = np.add(result, ai * xi)
result

[ 2 -1  4] 1.0
[ 1  4 -4] 1.0
[ 3  1 -1] 1.0

array([ 6.,  4., -1.])

8.2.1 What is going on?#

result = np.zeros(3)
# When we iterate over a numpy array, we iterate over its rows (the previous loop combined rows, not columns)
# So iterating over the rows of the transpose is the same as iterating over the columns of A
for ai, xi in zip(A.T, x):
  result = np.add(result, ai * xi) # np.sum()
result

array([5., 1., 3.])

8.2.2 Using lists (list comprehension)#

arr = []
for i in range(0, 11):
  arr.append(i*i)
arr

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

[i*i for i in range(0, 11) if i % 2 == 0] #

[0, 4, 16, 36, 64, 100]

np.sum([x[i]*A[:, i] for i in range(A.shape[1])], axis=0)

array([5., 1., 3.])

8.2.3 A look at Big data#

A = np.random.randint(-5, 5, (3, 3)) # Generate a square matrix of integers in the range [-5, 5) (upper bound excluded)
A

array([[ 4, -1,  0],
       [ 0,  2,  3],
       [ 4, -5, -5]])

x = np.ones(3)

A @ x

array([ 3.,  5., -6.])

import numpy as np
from functools import reduce

# Haskell
# Hadoop -> Yahoo; Big Data -> MapReduce
# Spark -> Databricks

reduce(np.add, [x[i]*A[:, i] for i in range(A.shape[1])])

array([ 3.,  5., -6.])

reduce(np.add, map(lambda xi, ai: xi*ai, x, A.T))

array([ 3.,  5., -6.])

8.3 Vector-Matrix product (row vector on the left)#

It is also possible to multiply on the left by a row vector.

If we write $A$ by columns, then we can express $x^{\top} A$ as,

\[\begin{split} y^{T}=x^{T} A=x^{T}\left[\begin{array}{cccc} \mid & \mid & & \mid \\ a^{1} & a^{2} & \cdots & a^{n} \\ \mid & \mid & & \mid \end{array}\right]=\left[\begin{array}{llll} x^{T} a^{1} & x^{T} a^{2} & \cdots & x^{T} a^{n} \end{array}\right] \end{split}\]

# Your turn
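One possible way to approach this exercise is sketched below: each entry of $x^{T} A$ is the dot product of $x$ with one column of $A$. The matrix `M` copies the first $3 \times 3$ example from section 8, and the vector `x` and the name `by_columns` are assumptions made for this illustration.

```python
import numpy as np

# Example matrix (same values as the first 3x3 matrix shown in section 8)
M = np.array([[2, -1, 4],
              [1, 4, -4],
              [3, 1, -1]])
x = np.array([1.0, 2.0, 3.0])  # illustrative vector

# Entry j of x^T M is the dot product of x with column j of M
by_columns = np.array([x.dot(M[:, j]) for j in range(M.shape[1])])

print(by_columns)
print(x @ M)  # numpy treats the 1-D x as a row vector here
```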

8.4 Vector-Matrix product (by rows)#

Expressing $A$ in terms of its rows instead, we have:

\[\begin{split} \begin{aligned} y^{T}=x^{T} A & =\left[\begin{array}{llll} x_{1} & x_{2} & \cdots & x_{m} \end{array}\right]\left[\begin{array}{ccc} - & a_{1}^{T} & - \\ - & a_{2}^{T} & - \\ & \vdots & \\ - & a_{m}^{T} & - \end{array}\right] \\ & =x_{1}\left[\begin{array}{lll} - & a_{1}^{T} & - \end{array}\right]+x_{2}\left[\begin{array}{lll} - & a_{2}^{T} & - \end{array}\right]+\ldots+x_{m}\left[\begin{array}{lll} - & a_{m}^{T} & - \end{array}\right] \end{aligned} \end{split}\]

$y^{T}$ is a linear combination of the rows of $A$.

# Your turn
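A possible sketch for this exercise builds $x^{T} A$ as a linear combination of the rows of $A$, weighting row $i$ by $x_i$. The matrix `M`, the vector `x` and the name `by_rows` are illustrative assumptions.

```python
import numpy as np

# Example matrix (same values as the first 3x3 matrix shown in section 8)
M = np.array([[2, -1, 4],
              [1, 4, -4],
              [3, 1, -1]])
x = np.array([1.0, 2.0, 3.0])  # illustrative vector

by_rows = np.zeros(M.shape[1])
for xi, row in zip(x, M):  # iterating over a 2-D array yields its rows
    by_rows = by_rows + xi * row

print(by_rows)
print(x @ M)
```

Unlike the column view of $Ax$, no transpose is needed here, because numpy already iterates over rows.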

9 Operations between matrices#

9.1 Matrix-Matrix multiplication: dot products#

As a set of vector-vector products (dot products)

\[\begin{split} C=A B=\left[\begin{array}{ccc} - & a_{1}^{T} & - \\ - & a_{2}^{T} & - \\ & \vdots & \\ - & a_{m}^{T} & - \end{array}\right]\left[\begin{array}{cccc} \mid & \mid & & \mid \\ b^{1} & b^{2} & \cdots & b^{p} \\ \mid & \mid & & \mid \end{array}\right]=\left[\begin{array}{cccc} a_{1}^{T} b^{1} & a_{1}^{T} b^{2} & \cdots & a_{1}^{T} b^{p} \\ a_{2}^{T} b^{1} & a_{2}^{T} b^{2} & \cdots & a_{2}^{T} b^{p} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m}^{T} b^{1} & a_{m}^{T} b^{2} & \cdots & a_{m}^{T} b^{p} \end{array}\right] . \end{split}\]

What condition must the dimensions of $A$ and $B$ satisfy for $AB$ to exist?

The number of columns of $A$ must equal the number of rows of $B$.

# Random square matrix
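The dimension condition can be seen directly in numpy, which raises a `ValueError` when the inner dimensions do not match. A minimal sketch with illustrative matrices `P` ($2 \times 3$) and `Q` ($3 \times 4$):

```python
import numpy as np

P = np.ones((2, 3))
Q = np.ones((3, 4))

# 3 columns in P and 3 rows in Q: the product exists and is 2x4
product_shape = (P @ Q).shape
print(product_shape)

# 4 columns in Q and 2 rows in P: the product Q P does not exist
try:
    Q @ P
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("Incompatible shapes:", err)
```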
A = np.random.randint(-5, 5, (3, 3))
A

array([[ 1, -3, -2],
       [ 3,  2, -5],
       [-1,  4, -3]])

B = np.random.randint(-5, 5, (3, 3))
B

array([[-4,  0,  2],
       [ 2, -5,  2],
       [-2,  4, -5]])

A @ B

array([[ -6,   7,   6],
       [  2, -30,  35],
       [ 18, -32,  21]])

A.dot(B)

array([[ -6,   7,   6],
       [  2, -30,  35],
       [ 18, -32,  21]])

9.2 Matrix-Matrix multiplication: sum of outer products#

As a sum of outer products (related to the SVD factorization)

\[\begin{split} C=A B=\left[\begin{array}{cccc} \mid & \mid & & \mid \\ a^{1} & a^{2} & \cdots & a^{p} \\ \mid & \mid & & \mid \end{array}\right]\left[\begin{array}{ccc} - & b_{1}^{T} & - \\ - & b_{2}^{T} & - \\ & \vdots & \\ - & b_{p}^{T} & - \end{array}\right]=\sum_{i=1}^{p} a^{i} b_{i}^{T} . \end{split}\]

What is $p$?

9.2.1 Let's try to program it#

A

array([[ 1, -3, -2],
       [ 3,  2, -5],
       [-1,  4, -3]])

B

array([[-4,  0,  2],
       [ 2, -5,  2],
       [-2,  4, -5]])

A@B

array([[ -6,   7,   6],
       [  2, -30,  35],
       [ 18, -32,  21]])

np.outer(A[:, 0], B[0, :])

array([[ -4,   0,   2],
       [-12,   0,   6],
       [  4,   0,  -2]])

np.outer(A[:, 1], B[1, :])

array([[ -6,  15,  -6],
       [  4, -10,   4],
       [  8, -20,   8]])

np.outer(A[:, 2], B[2, :])

array([[  4,  -8,  10],
       [ 10, -20,  25],
       [  6, -12,  15]])

resultado = np.zeros((3, 3))
for i in range(3):
  print(np.outer(A[:, i], B[i, :]))
  resultado = np.add(resultado, np.outer(A[:, i], B[i, :]))

[[ -4   0   2]
 [-12   0   6]
 [  4   0  -2]]
[[ -6  15  -6]
 [  4 -10   4]
 [  8 -20   8]]
[[  4  -8  10]
 [ 10 -20  25]
 [  6 -12  15]]

resultado

array([[ -6.,   7.,   6.],
       [  2., -30.,  35.],
       [ 18., -32.,  21.]])

9.2.2 Answer#

np.sum([np.outer(A[:, i], B[i,:]) for i in range(A.shape[1])], axis=0)

array([[ -6,   7,   6],
       [  2, -30,  35],
       [ 18, -32,  21]])

9.3 Matrix-Matrix multiplication: A times the columns of B#

As a set of matrix-vector products.

\[\begin{split} C=A B=A\left[\begin{array}{cccc} \mid & \mid & & \mid \\ b^{1} & b^{2} & \ldots & b^{n} \\ \mid & \mid & & \mid \end{array}\right]=\left[\begin{array}{cccc} \mid & \mid & & \mid \\ A b^{1} & A b^{2} & \ldots & A b^{n} \\ \mid & \mid & & \mid \end{array}\right] \end{split}\]

Here the $i$-th column of $C$ is given by the matrix-vector product of $A$ with the $i$-th column of $B$, $c^{i}=A b^{i}$. These matrix-vector products can, in turn, be interpreted using both viewpoints given in the previous section.

# Your turn
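One possible sketch for this exercise computes each column of $C$ as $A$ times the matching column of $B$ and stacks the results side by side. The matrices copy the values of `A` and `B` shown in section 9.1; the name `C_cols` is an assumption.

```python
import numpy as np

# Same values as the A and B shown in section 9.1
A = np.array([[1, -3, -2],
              [3, 2, -5],
              [-1, 4, -3]])
B = np.array([[-4, 0, 2],
              [2, -5, 2],
              [-2, 4, -5]])

# Column j of C is the matrix-vector product A @ (column j of B)
C_cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

print(C_cols)
print(np.array_equal(C_cols, A @ B))
```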

9.4 Matrix-Matrix multiplication: vector-matrix products with the rows of A#

As a set of vector-matrix products with the rows of $A$.

\[\begin{split} C=A B=\left[\begin{array}{ccc} - & a_{1}^{T} & - \\ - & a_{2}^{T} & - \\ & \vdots & \\ - & a_{m}^{T} & - \end{array}\right] B=\left[\begin{array}{ccc} - & a_{1}^{T} B & - \\ - & a_{2}^{T} B & - \\ & \vdots & \\ - & a_{m}^{T} B & - \end{array}\right] \end{split}\]

# Your turn
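A possible sketch for this exercise computes each row of $C$ as the matching row of $A$ times $B$ and stacks the results vertically. The matrices copy the values of `A` and `B` shown in section 9.1; the name `C_rows` is an assumption.

```python
import numpy as np

# Same values as the A and B shown in section 9.1
A = np.array([[1, -3, -2],
              [3, 2, -5],
              [-1, 4, -3]])
B = np.array([[-4, 0, 2],
              [2, -5, 2],
              [-2, 4, -5]])

# Row i of C is the vector-matrix product (row i of A) @ B
C_rows = np.vstack([A[i, :] @ B for i in range(A.shape[0])])

print(C_rows)
print(np.array_equal(C_rows, A @ B))
```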

9.5 Matrix-Matrix multiplication: properties#

Associative: $(A B) C=A(B C)$.
Distributive: $A(B+C)=A B+A C$.
Not commutative in general; that is, it may happen that $A B \neq B A$. (For example, if $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times q}$, the matrix product $B A$ does not even exist when $m$ and $q$ are not equal!)
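These properties can be illustrated with small integer matrices, where the arithmetic is exact. In the sketch below, `A` and `B` copy the values shown in section 9.1, while `R` is an extra illustrative matrix introduced only for this example.

```python
import numpy as np

# Same values as the A and B shown in section 9.1
A = np.array([[1, -3, -2],
              [3, 2, -5],
              [-1, 4, -3]])
B = np.array([[-4, 0, 2],
              [2, -5, 2],
              [-2, 4, -5]])
R = np.array([[1, 0, 2],
              [0, 1, 0],
              [3, 0, 1]])  # illustrative third matrix

associative = np.array_equal((A @ B) @ R, A @ (B @ R))
distributive = np.array_equal(A @ (B + R), A @ B + A @ R)
commutes = np.array_equal(A @ B, B @ A)

print(associative, distributive, commutes)
```

Here both products `A @ B` and `B @ A` exist because the matrices are square, yet they still differ.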