You need three indices for the parameter, so the easiest way to do that is to stack the original 2D dataframe into a regular dataframe with two indices and one data column. This makes it easy to add the third index corresponding to time.
I understand. Do I have to define the names of the columns? What is the meaning of "S"?
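A minimal sketch of that reshaping with pandas. The tickers AAA and BBB, the covariance values, the time index 0, and the column names Stock1, Stock2, and Sigma are all made up for illustration:

```python
import pandas as pd

# Toy 2D covariance dataframe (values are made up for illustration).
cov = pd.DataFrame(
    [[0.04, 0.01], [0.01, 0.09]],
    index=["AAA", "BBB"],
    columns=["AAA", "BBB"],
)

# stack() turns the 2D table into a single data column indexed by (row, column).
long_df = cov.stack().to_frame().reset_index()
long_df.columns = ["Stock1", "Stock2", "Sigma"]

# Add the third index (time) and make all three columns the index.
long_df["Time"] = 0
long_df = long_df.set_index(["Time", "Stock1", "Stock2"])
print(long_df)
```

The result has one row per (Time, Stock1, Stock2) combination and a single value column, which is the shape a parameter with three indices expects.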
S is the name of the parameter that will receive the data. In your case you should replace "S" by "Sigma", with Sigma defined as param Sigma{Time, A, A};
Sigma is not defined anymore. Isn't it the new sigma dataframe? I want to use 19 stocks, so the variance calculation will involve 19 stocks. I need a 19x19 matrix for each time period. After computing the dataframe I will write the AMPL model.
What is the definition of the parameter you want to load the data into?
The dataframe in the screenshot below contains covariances for pairs of stocks for each time window:
This problem is really hard for me, but I like this topic. Thanks for helping me.
I want to minimize variance. In the rebalancing problem I will do it with a time parameter. The Markowitz mean-variance model covers just one period; I want to solve this problem with multiperiod rebalancing, so the time and the whole data change with the rolling window.
In that model you have Sigma[t,i,j], so you need to adjust the definition of Sigma to param Sigma{Time, A, A}; since the first index is Time. With this definition you can load a dataframe with the shape shown in my previous reply.
But I got a mismatch error in df.columns?
The dataframe in the screenshot has pairs of stocks as index values and not in the columns. There are just 4 columns: three for indices and one for values. This is the shape needed for Sigma indexed over {Time, A, A}.
You got the error by trying to set more column names than the number of columns in the dataframe.
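A small sketch of how that error arises, using a made-up 4-column frame (the tickers, the values, and the extra name "Extra" are hypothetical):

```python
import pandas as pd

# Toy frame with 4 columns: three index-like columns and one value column.
df = pd.DataFrame(
    {
        "Time": [0, 0],
        "Stock1": ["AAA", "AAA"],
        "Stock2": ["AAA", "BBB"],
        "S": [0.04, 0.01],
    }
)

try:
    # Five names for four columns: pandas rejects the assignment.
    df.columns = ["Time", "Stock1", "Stock2", "S", "Extra"]
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
# Checking the number of columns first shows how many names are needed.
print(len(df.columns))
```

The list assigned to df.columns must have exactly one name per existing column, no matter how many stocks appear in the rows.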
But I need to write my stock names. I need to adjust for 19 stocks. Stock1 is a stock name, right?
No, the names of the stocks are DIS, KO, XOM, etc.; Stock1 and Stock2 are the names of the index columns. To load the names of your stocks, you need to set them on your original covariance dataframe.
In Google Colab you can see that being done with pd.DataFrame(ef.cov_matrix, index=ef.tickers, columns=ef.tickers).
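As an illustration of that labeling step, here is a sketch with a made-up 3x3 matrix standing in for ef.cov_matrix and a short hypothetical ticker list standing in for ef.tickers:

```python
import numpy as np
import pandas as pd

tickers = ["DIS", "KO", "XOM"]

# Made-up symmetric matrix standing in for ef.cov_matrix.
cov_values = np.array(
    [
        [0.05, 0.01, 0.02],
        [0.01, 0.03, 0.00],
        [0.02, 0.00, 0.06],
    ]
)

# Label both rows and columns with the ticker names.
cov = pd.DataFrame(cov_values, index=tickers, columns=tickers)

# After stacking, the ticker names appear as index values for each pair.
stacked = cov.stack()
print(stacked.loc[("DIS", "KO")])
```

Once the tickers are in the index and columns of the 2D dataframe, stacking carries them into the Stock1 and Stock2 index columns automatically.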
Did you have the names in the dataframe data? The notebook at Google Colab minimizes the average variance across the last 26 weeks. The code is as follows:
from pypfopt import expected_returns, risk_models
from datetime import datetime, timedelta
from amplpy import AMPL
import yfinance as yf
import numpy as np
import pandas as pd
tickers = [
    "HD", "MCD", "NKE", "KO", "PG", "SYY", "WMT",  # Consumer Staples
    "CVX", "XOM",  # Energy
    "AXP", "JPM",  # Financials
    "JNJ", "MRK", "PFE", "WBA",  # Health Care
    "BA", "CAT", "MMM",  # Industrials
]
end_date = datetime.now().date()
start_date = end_date - timedelta(weeks=26)
ohlc = yf.download(tickers, start=start_date, end=end_date)
prices = ohlc["Adj Close"].dropna(how="all")
n_slices = 26
# display(prices)
slices = np.array_split(prices, n_slices)
dfs = []
for i, slice_df in enumerate(slices):
    # Covariance matrix of this slice, stacked into one value column
    df = risk_models.risk_matrix(slice_df, method="sample_cov").stack().to_frame()
    df.reset_index(inplace=True)  # Turn the index into regular data columns
    df.columns = ["Stock1", "Stock2", "S"]  # Adjust column names
    df["Time"] = i  # Add new column with the index of the slice
    dfs.append(df)
sigma = pd.concat(dfs) # Concatenate all dataframes
sigma.set_index(["Time", "Stock1", "Stock2"], inplace=True) # Set the index to be (Time, Stock1, Stock2)
# display(sigma)
ampl = AMPL()
ampl.eval(
    r"""
    set Time;
    set A ordered;
    param Sigma{Time, A, A};
    param lb default 0;
    param ub default 1;
    var w{A} >= lb <= ub;
    minimize average_portfolio_variance:
        (sum {t in Time} sum {i in A, j in A} w[i] * Sigma[t, i, j] * w[j]) / card(Time);
    s.t. portfolio_weights:
        sum {i in A} w[i] = 1;
    """
)
ampl.set["Time"] = range(n_slices)
ampl.set["A"] = tickers
ampl.param["Sigma"] = sigma
ampl.option["solver"] = "gurobi"
ampl.solve()
print("minimal average variance:", ampl.get_value("average_portfolio_variance"))
ampl.get_data("w").to_pandas().plot.barh()
It produces the following output:
I think your solution is correct, but I'm confused. Why do you divide by card(Time) in the objective? The weight of asset i in time period t is denoted by w_it, so I think the weights should depend on the parameter t. So I wrote:
ampl = AMPL()
ampl.eval(
    r"""
    set Time;
    set A ordered;
    param Sigma{Time, A, A};
    param lb default 0;
    param ub default 1;
    var w{Time, A} >= lb <= ub;
    minimize average_portfolio_variance:
        (sum {t in Time} sum {i in A, j in A} w[t,i] * Sigma[t, i, j] * w[t,j]) / card(Time);
    s.t. portfolio_weights {t in Time}:
        sum {i in A} w[t,i] = 1;
    """
)
ampl.set["Time"] = range(n_slices)
ampl.set["A"] = tickers
ampl.param["Sigma"] = sigma
ampl.option["solver"] = "gurobi"
ampl.solve()
print("minimal average variance:", ampl.get_value("average_portfolio_variance"))
ampl.get_data("w").to_pandas().plot.barh()
then I got:
In the model I wrote, I minimized the average variance with the same weights used in all periods. It was just to show how to load the data. Dividing by card(Time) only turns the sum over periods into an average; it does not change the optimal weights. Your model minimizes the average variance while allowing different weights for each time period. It just depends on what you want to model.
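To make the difference between the two models concrete, here is a rough numerical sketch with NumPy. It is not the AMPL model: it assumes two made-up 2x2 covariance matrices and uses the closed-form minimum-variance weights for the constraint sum(w) = 1, ignoring the bounds lb and ub. The helper min_variance_weights is a hypothetical name introduced here.

```python
import numpy as np


def min_variance_weights(cov):
    """Closed-form minimum-variance weights with sum(w) = 1 and no bounds."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)
    return x / x.sum()


# Two made-up covariance matrices, one per time period.
sigmas = np.array(
    [
        [[0.04, 0.01], [0.01, 0.09]],
        [[0.10, 0.02], [0.02, 0.05]],
    ]
)

# Shared weights: the average variance equals w' * mean(Sigma_t) * w,
# so one set of weights is optimized against the averaged matrix.
w_shared = min_variance_weights(sigmas.mean(axis=0))
avg_var_shared = np.mean([w_shared @ s @ w_shared for s in sigmas])

# Per-period weights: with no constraint linking the periods,
# the problem separates into one independent problem per period.
w_by_period = [min_variance_weights(s) for s in sigmas]
avg_var_by_period = np.mean([w @ s @ w for w, s in zip(w_by_period, sigmas)])

print("shared weights:    ", w_shared, avg_var_shared)
print("per-period weights:", w_by_period, avg_var_by_period)
```

Because the per-period model has more freedom, its average variance can never be larger than that of the shared-weights model on the same data.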
tau = 240
for t in range(tau, len(data)):
    stocks = [
        "HD", "MCD", "NKE", "KO", "PG", "SYY", "WMT",  # Consumer Staples
        "CVX", "XOM",  # Energy
        "AXP", "JPM",  # Financials
        "JNJ", "MRK", "PFE", "WBA",  # Health Care
        "BA", "CAT", "MMM",  # Industrials
        "IRX",  # Information Technology; T-bill
    ]
    stocks_in_sample = data[stocks][t - tau : t]
    slices = np.array_split(data[stocks], 240)
    dfs = []
    for i, slice_df in enumerate(slices):
        df = risk_models.risk_matrix(slice_df, method="sample_cov").stack().to_frame()
        df.reset_index(inplace=True)  # Turn the index into regular data columns
        df.columns = ["Stock1", "Stock2", "S"]  # Adjust column names
        df["Time"] = i  # Add new column with the index of the slice
        dfs.append(df)
    sigma = pd.concat(dfs)  # Concatenate all dataframes
    sigma.set_index(["Time", "Stock1", "Stock2"], inplace=True)  # Set the index to be (Time, Stock1, Stock2)
    display(sigma)
I want to define a for loop that iterates over a rolling window before the AMPL code. First, I put stocks_in_sample.values in the enumerate() call, but that did not work, so then I wrote the code above.
That error typically happens if the AMPL process is killed because there is not enough memory available on the machine. Can you run it again while monitoring memory usage and see if it is hitting the memory limit?
I have enough memory on my machine. What do I need to do? Also, I do not want slicing; I want iteration like a rolling window.
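One possible way to build rolling-window covariances instead of disjoint slices is sketched below. It is only an illustration under stated assumptions: the price data is randomly generated, the tickers AAA, BBB, and CCC are made up, tau is shortened to 10, and the plain (not annualized) sample covariance from DataFrame.cov() on simple returns is used in place of risk_models.risk_matrix.

```python
import numpy as np
import pandas as pd

# Made-up price data standing in for the real price dataframe.
rng = np.random.default_rng(0)
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=(30, 3)), axis=0)),
    columns=["AAA", "BBB", "CCC"],
)
returns = prices.pct_change().dropna()

tau = 10  # window length (240 in the post above)
dfs = []
for t in range(tau, len(returns) + 1):
    # The tau rows ending just before position t; consecutive windows overlap.
    window = returns.iloc[t - tau : t]
    df = window.cov().stack().to_frame().reset_index()
    df.columns = ["Stock1", "Stock2", "Sigma"]
    df["Time"] = t  # label each window by its end position
    dfs.append(df)

# One dataframe indexed by (Time, Stock1, Stock2), built once after the loop.
sigma = pd.concat(dfs).set_index(["Time", "Stock1", "Stock2"])
print(sigma.head())
```

With this layout the Time set would contain the window end positions, range(tau, len(returns) + 1), rather than range(n_slices), and the dataframe is assembled once after the loop instead of being rebuilt on every iteration.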