Portfolio optimization

Firdevs_Uykun · December 22, 2023, 9:41pm

I want to extend the columns to 19 stocks. Why is this not possible?

fdabrandao · December 22, 2023, 9:46pm

You need three indices for the parameter, so the easiest way to do that is to stack the original 2D dataframe into a regular dataframe with two index columns and one data column. This makes it easy to add the third index corresponding to time.
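A minimal sketch of that reshaping, assuming a made-up 3x3 covariance matrix. The tickers, the values, and the column names Stock1, Stock2, and Sigma are illustrative choices, not taken from the original notebook:

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 covariance matrix; tickers and values are made up.
tickers = ["DIS", "KO", "XOM"]
cov = pd.DataFrame(
    np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]]),
    index=tickers,
    columns=tickers,
)

# Stack the 2D matrix into long format: one row per (row label, column label) pair.
long_df = cov.stack().to_frame().reset_index()
long_df.columns = ["Stock1", "Stock2", "Sigma"]

# Add the third index (the time window) and move all three index columns into the index.
long_df["Time"] = 0
long_df = long_df.set_index(["Time", "Stock1", "Stock2"])
print(long_df)
```

The result has one value column and a three-level index, which is the layout a parameter indexed over three sets can be loaded from.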

Firdevs_Uykun · December 22, 2023, 9:54pm

I understand. Do I have to define the names of the columns? What is the meaning of ‘S’?

fdabrandao · December 22, 2023, 10:02pm

S is the name of the parameter that will receive the data. In your case you should replace "S" with "Sigma", where Sigma is defined as param Sigma{Time, A, A};

Firdevs_Uykun · December 22, 2023, 10:05pm

Sigma is not defined anymore. Isn’t it the new sigma df? I want to add 19 stocks, so the variance calculation will use all 19 stocks. I need a 19*19 matrix plus time. After computing the df I will write the AMPL model.

fdabrandao · December 22, 2023, 10:10pm

What is the definition of the parameter you want to load the data into?

The dataframe in the screenshot below contains covariances for pairs of stocks for each time window:

Firdevs_Uykun · December 22, 2023, 10:14pm

This problem is really hard for me, but I like this topic. Thanks for helping me.

I want to minimize variance. In the rebalancing problem I will do it with a time parameter. Markowitz's mean-variance model covers just one period; I want to solve this problem with multiperiod rebalancing. So the time and the whole data set change with the rolling window.

fdabrandao · December 22, 2023, 10:26pm

In that model you have Sigma[t,i,j] so you need to adjust the definition of Sigma to param Sigma{Time, A, A}; since the first index is Time. With this definition you can load a dataframe with the shape shown in my previous reply.

Firdevs_Uykun · December 22, 2023, 10:28pm

But I got a mismatch error when setting df.columns?

fdabrandao · December 22, 2023, 10:42pm

The dataframe in the screenshot has pairs of stocks as index values, not as columns. There are just 4 columns: three for the indices and one for the values. This is the shape needed for Sigma indexed over {Time, A, A}.

You got the error by trying to set more column names than the number of columns in the dataframe.
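The mismatch can be reproduced with a small sketch. The tickers and the identity matrix are placeholders, not the data from the thread:

```python
import numpy as np
import pandas as pd

tickers = ["DIS", "KO", "XOM"]  # hypothetical tickers
cov = pd.DataFrame(np.eye(3), index=tickers, columns=tickers)

# After stacking and resetting the index there are exactly 3 columns:
# two index columns plus one value column, no matter how many stocks there are.
df = cov.stack().to_frame().reset_index()
n_columns = len(df.columns)

error_message = None
try:
    # Wrong: supplying more names than the stacked frame has columns.
    df.columns = ["DIS", "KO", "XOM", "Value"]
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)

# Right: one name per index column and one for the values.
df.columns = ["Stock1", "Stock2", "Sigma"]
```

With 19 stocks the stacked frame simply has 19*19 rows; the number of columns stays at three.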

Firdevs_Uykun · December 22, 2023, 10:49pm

But I need to write my stock names. I need to adjust this for 19 stocks. Stock1 is a stock name, right?

fdabrandao · December 22, 2023, 10:59pm

No, the names of the stocks are DIS, KO, XOM, etc.; Stock1 and Stock2 are the names of the index columns. To load the names of your stocks, you need to set them as the row and column labels of your original covariance dataframe.

In Google Colab you can see that being done with pd.DataFrame(ef.cov_matrix, index=ef.tickers, columns=ef.tickers).
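A short sketch of why the labels matter, assuming a made-up covariance array in place of ef.cov_matrix and three hypothetical tickers in place of ef.tickers:

```python
import numpy as np
import pandas as pd

tickers = ["DIS", "KO", "XOM"]           # hypothetical tickers
raw_cov = np.array([[0.04, 0.01, 0.00],  # made-up covariance values
                    [0.01, 0.09, 0.02],
                    [0.00, 0.02, 0.16]])

# Without labels, the stacked index holds integer positions.
unlabeled = pd.DataFrame(raw_cov).stack()

# With labels, the stacked index holds the ticker names, which is what
# has to match the members of the AMPL set A.
labeled = pd.DataFrame(raw_cov, index=tickers, columns=tickers).stack()

print(unlabeled.index[:3].tolist())
print(labeled.index[:3].tolist())
```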

Firdevs_Uykun · December 23, 2023, 6:25am

I used exactly your code but my result is different

Firdevs_Uykun · December 23, 2023, 6:27am

Then when I look at just df:

fdabrandao · December 23, 2023, 12:23pm

Did you have the names in the dataframe data? The notebook at Google Colab minimizes the average variance across the last 26 weeks. The code is as follows:

from pypfopt import expected_returns, risk_models
from datetime import datetime, timedelta
from amplpy import AMPL
import yfinance as yf
import numpy as np
import pandas as pd

tickers = [
    "HD", "MCD", "NKE", "KO", "PG", "SYY", "WMT", # Consumer Staples
    "CVX", "XOM", # Energy
    "AXP", "JPM", # Financials
    "JNJ", "MRK", "PFE", "WBA", # Health Care
    "BA", "CAT", "MMM", # Industrials
]
end_date = datetime.now().date()
start_date = end_date - timedelta(weeks=26)
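# auto_adjust=False keeps the "Adj Close" column, which newer yfinance versions drop by default
ohlc = yf.download(tickers, start=start_date, end=end_date, auto_adjust=False)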
prices = ohlc["Adj Close"].dropna(how="all")

n_slices = 26
# display(prices)
slices = np.array_split(prices, n_slices)

dfs = []
for i, slice_df in enumerate(slices):
    df = risk_models.risk_matrix(slice_df, method="sample_cov").stack().to_frame()
    df.reset_index(inplace=True) # Turn the index into regular data columns
    df.columns = ["Stock1", "Stock2", "S"] # Adjust column names
    df["Time"] = i  # Add new column with the index of the slice
    dfs.append(df)

sigma = pd.concat(dfs)  # Concatenate all dataframes
sigma.set_index(["Time", "Stock1", "Stock2"], inplace=True)  # Set the index to be (Time, Stock1, Stock2)
# display(sigma)

ampl = AMPL()
ampl.eval(
    r"""
    set Time;
    set A ordered;
    param Sigma{Time, A, A};
    param lb default 0;
    param ub default 1;
    var w{A} >= lb <= ub;
    minimize average_portfolio_variance:
        (sum {t in Time} sum {i in A, j in A} w[i] * Sigma[t, i, j] * w[j]) / card(Time);
    s.t. portfolio_weights:
        sum {i in A} w[i] = 1;
    """
)
ampl.set["Time"] = range(n_slices)
ampl.set["A"] = tickers
ampl.param["Sigma"] = sigma
ampl.option["solver"] = "gurobi"
ampl.solve()
print("minimal average variance:", ampl.get_value("average_portfolio_variance"))
ampl.get_data("w").to_pandas().plot.barh()

It produces the following output:

Firdevs_Uykun · December 24, 2023, 12:06pm

I think your solution is correct, but I’m confused. Why do you divide by card(Time) in the minimization formula? The weight of asset i in time period t is denoted by w_it, so I think the asset weights depend on t. So I wrote:

ampl = AMPL()
ampl.eval(
    r"""
    set Time;
    set A ordered;
    param Sigma{Time, A, A};
    param lb default 0;
    param ub default 1;
    var w{Time, A} >= lb <= ub;
    minimize average_portfolio_variance:
        (sum {t in Time} sum {i in A, j in A} w[t, i] * Sigma[t, i, j] * w[t, j]) / card(Time);
    s.t. portfolio_weights {t in Time}:
        sum {i in A} w[t, i] = 1;
    """
)
ampl.set["Time"] = range(n_slices)
ampl.set["A"] = tickers
ampl.param["Sigma"] = sigma
ampl.option["solver"] = "gurobi"
ampl.solve()
print("minimal average variance:", ampl.get_value("average_portfolio_variance"))
ampl.get_data("w").to_pandas().plot.barh()

Then I got:

fdabrandao · December 27, 2023, 10:42am

In the model I wrote, I minimized the average variance with the same weights used in every period. It was just to show how to load the data. Your model minimizes the average variance while allowing different weights for each time period. It just depends on what you want to model.
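The difference between the two formulations can be illustrated numerically. Dividing by card(Time) only turns the sum into an average and does not change the optimal weights. With shared weights, the average of w' Sigma_t w over t equals w' mean(Sigma) w, so the problem reduces to a single-period problem on the average covariance matrix. With per-period weights and no constraint linking the periods, the objective separates into one independent problem per period. The sketch below assumes synthetic covariance matrices and, for simplicity, ignores the lb/ub bounds so that the closed-form minimum-variance weights apply; it is not the AMPL model from the thread:

```python
import numpy as np

def min_variance_weights(cov):
    """Closed-form minimum-variance weights with sum(w) == 1, ignoring lb/ub bounds."""
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)
    return x / x.sum()

rng = np.random.default_rng(0)
n_periods, n_assets = 4, 3

# Synthetic positive-definite covariance matrices, one per period (made-up data).
sigmas = []
for _ in range(n_periods):
    a = rng.normal(size=(10, n_assets))
    sigmas.append(a.T @ a / 10 + 0.01 * np.eye(n_assets))
sigmas = np.array(sigmas)

# Shared weights: solve once on the average covariance matrix.
w_shared = min_variance_weights(sigmas.mean(axis=0))
shared_avg = np.mean([w_shared @ s @ w_shared for s in sigmas])

# Per-period weights: each period is solved on its own.
w_per_period = [min_variance_weights(s) for s in sigmas]
per_period_avg = np.mean([w @ s @ w for w, s in zip(w_per_period, sigmas)])

print("shared weights:    ", shared_avg)
print("per-period weights:", per_period_avg)
```

The per-period average can never exceed the shared-weights average, because choosing the same weights in every period is one feasible choice in the per-period model. A rebalancing model usually adds something that links the periods, such as transaction costs, otherwise the periods stay independent.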

Firdevs_Uykun · December 27, 2023, 10:47pm

tau = 240
for t in range(tau, len(data)):
    stocks = [
        'HD', 'MCD', 'NKE', 'KO', 'PG', 'SYY', 'WMT',  # Consumer Staples
        'CVX', 'XOM',  # Energy
        'AXP', 'JPM',  # Financials
        'JNJ', 'MRK', 'PFE', 'WBA',  # Health Care
        'BA', 'CAT', 'MMM',  # Industrials
        'IRX',  # Information Technology  # t-bill
    ]
    stocks_in_sample = data[stocks][t - tau : t]
    slices = np.array_split(data[stocks], 240)
    dfs = []
    for i, slice_df in enumerate(slices):
        df = risk_models.risk_matrix(slice_df, method="sample_cov").stack().to_frame()
        df.reset_index(inplace=True)  # Turn the index into regular data columns
        df.columns = ["Stock1", "Stock2", "S"]  # Adjust column names
        df["Time"] = i  # Add new column with the index of the slice
        dfs.append(df)
    sigma = pd.concat(dfs)  # Concatenate all dataframes
    sigma.set_index(["Time", "Stock1", "Stock2"], inplace=True)  # Set the index to be (Time, Stock1, Stock2)
    display(sigma)

I want to define a for loop that iterates over a rolling window, placed above the AMPL code. First, I passed stocks_in_sample.values to enumerate(), but that did not work, so I wrote the code above.

fdabrandao · December 27, 2023, 10:53pm

That error typically happens if the AMPL process is killed because the machine runs out of available memory. Can you run it again while monitoring memory usage and see if it is hitting the memory limit?

Firdevs_Uykun · December 27, 2023, 11:03pm

I have enough memory on my machine. What do I need to do? Also, I do not want slicing; I want iteration like a rolling window.
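One possible shape for such a rolling-window loop is sketched below. It is not from the thread: it assumes synthetic returns in place of the real data, a short window so it runs quickly, and the plain pandas DataFrame.cov as a stand-in for risk_models.risk_matrix. Each window is built once, outside any inner slicing loop, and labeled by its end position t:

```python
import numpy as np
import pandas as pd

# Synthetic daily returns standing in for the real data (made-up values).
rng = np.random.default_rng(1)
stocks = ["HD", "KO", "XOM"]
returns = pd.DataFrame(rng.normal(0, 0.01, size=(30, len(stocks))), columns=stocks)

tau = 20  # window length; the thread uses 240
dfs = []
for t in range(tau, len(returns)):
    window = returns.iloc[t - tau : t]  # rows t-tau .. t-1
    # Sample covariance of the window, stacked into long format.
    df = window.cov().stack().to_frame().reset_index()
    df.columns = ["Stock1", "Stock2", "Sigma"]
    df["Time"] = t  # label the window by its end position
    dfs.append(df)

# One dataframe with a (Time, Stock1, Stock2) index covering all windows.
sigma = pd.concat(dfs).set_index(["Time", "Stock1", "Stock2"])
time_labels = list(range(tau, len(returns)))
print(sigma.shape, time_labels[:3])
```

With this layout the AMPL set Time would be filled with time_labels rather than range(n_slices), and the whole sigma dataframe is built once before the model is loaded.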
