On a special case of non-symmetric resource extraction games with unbounded payoffs

Article History: Received 31 December 2020; Accepted 16 June 2021; Available 14 September 2021

The game of resource extraction/capital accumulation is a stochastic infinite-horizon game which models the joint utilization of a productive asset over time. This paper complements the available results on the existence of a pure Markov perfect equilibrium in the non-symmetric game setting with an arbitrary number of agents. Moreover, we allow the players to have unbounded utilities and relax the assumption that the stochastic kernels of the transition probability must depend only on the amount of resource before consumption. This class of games has not been examined before. However, we are able to prove the existence of a Markov perfect equilibrium only in a specific case of interest: namely, when the players have constant relative risk aversion (CRRA) power utilities and the transition law follows a geometric random walk in the joint investment. The setup with these characteristics is motivated by economic considerations, which makes it relevant to a range of real-world problems.


Introduction
The game of resource extraction (also known as the capital accumulation game) belongs to the class of nonzero-sum stochastic infinite-horizon games. It is also an extension of the famous discrete-time one-sector optimal growth model (see [1,2]) to a strategic interaction of competing agents. The seminal study on the topic is by Levhari and Mirman [3], in which the authors considered a two-agent deterministic version of the game with identical logarithmic one-period utilities of the players and a Cobb-Douglas production function governing the resource quantity. The existence of a non-randomized stationary Nash equilibrium in a deterministic game setting was later established by Sundaram [4]. It is worth mentioning that extensions of the game in its deterministic formulation are still being studied (e.g., a recent paper [5] on fishery extraction with more than one species). The result of Sundaram relied on the assumptions that the preferences of the players are identical and bounded on the state space, i.e., the space of all possible resource stocks. Both of these assumptions were also instrumental in establishing the existence of a stationary Nash equilibrium in various stochastic frameworks of the game. The condition that the players have the same preferences makes the game symmetric. For results on the existence of a stationary Nash equilibrium in the symmetric setup of resource extraction games, the reader is referred to [6][7][8][9][10]. Studies [11,12] tackled the problem in the non-symmetric case while assuming that the preferences of the players are bounded. Such a condition was also important for studying non-symmetric stochastic games in a general context [13]. Moreover, the existing literature on non-symmetric supermodular stochastic games relies on the assumption of either a bounded state space [14,15] or bounded utilities [16].

Finally, some partial results on Nash equilibrium existence in non-symmetric resource extraction games with unbounded payoffs were obtained in [17,18]. These extensions were achieved at the cost of additional structural assumptions. In the paper of Amir [17] the model not only requires that the spaces of players' actions be bounded by constants, but also includes a very restrictive convexity assumption on the transition probability. A discussion of the relevance of such a transition was given in [18], concluding that it only makes sense in the case of a bounded state space. The approach of Jaśkiewicz and Nowak [18] is more general. The authors set the transition probability to be a convex combination of stochastic kernels which depend on the state variable, or, in other words, the resource stock before consumption. The coefficients of the combination, as in [17], are allowed to depend on the joint investment of the players. However, the restriction that the joint investment, i.e., the amount of the resource left after consumption, can influence only the coefficients is still limiting. In this study we extend our view to an unexamined class of non-symmetric resource extraction games in which not only are the players' utilities unbounded, but the transition probability is a Markov kernel depending on the joint investment. Moreover, the number of players in the game may be larger than two. At the same time, we restrict our attention to a specific form of the preferences and a concrete stochastic production function, choices which are motivated by economic considerations. This setting, on which we elaborate below, enables us to prove the existence of a non-randomized stationary Markov perfect equilibrium in the game. Our first assumption is that the utilities of the players are concave power functions. This type of preferences belongs to the constant relative risk aversion (CRRA) family.
CRRA is a typical example of an established unbounded utility, and it is commonly used in economics. The arguments in favour of CRRA utility functions are often supported by existing empirical studies on investors' behaviour (see [19][20][21]). The second assumption is that the transition law is determined by the state equation s′ = p · ξ, where p is the joint investment of the players and ξ is a random shock, which adds a stochastic nature to the process. The model with multiplicative random shocks is known in the economic literature as a geometric random walk. It is widely used in forecasting, especially for stock market data, where it is the default model. In order to avoid infinite values of the players' expected discounted rewards within this structure, an additional natural constraint is imposed on the random variable ξ, restricting the growth rate of the stock. The main result is presented as a theorem, the proof of which is based on the optimality principle of discounted dynamic programming. The preceding lemmata are aimed at finding the value functions which solve the corresponding Bellman equation for every player.
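To illustrate the kind of growth-rate constraint mentioned above, the following sketch checks a condition of the form β · E[ξ^α] < 1, which is a standard way to keep discounted CRRA rewards finite under multiplicative shocks. The lognormal shock distribution and all parameter values are illustrative assumptions, not taken from the paper; the exact constraint is stated in the paper's assumptions.

```python
import math
import random

# Illustrative parameters (assumptions, not taken from the paper):
alpha = 0.5           # CRRA exponent, u(x) = x**alpha
beta = 0.95           # discount factor
mu, sigma = 0.0, 0.2  # lognormal shock: xi = exp(N(mu, sigma^2))

# Closed-form fractional moment of a lognormal shock:
# E[xi**alpha] = exp(alpha*mu + (alpha*sigma)**2 / 2)
moment_exact = math.exp(alpha * mu + (alpha * sigma) ** 2 / 2)

# Monte Carlo check of the same moment.
rng = random.Random(0)
n = 200_000
moment_mc = sum(math.exp(rng.gauss(mu, sigma)) ** alpha for _ in range(n)) / n

# A growth condition of the form beta * E[xi**alpha] < 1 keeps the
# discounted CRRA rewards finite along any feasible play.
print(moment_exact, moment_mc, beta * moment_exact < 1)
```

With these parameters the moment is close to 1.005, so the discounted growth condition holds comfortably.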

The model
A nonzero-sum m-person stochastic resource extraction game is described by its standard primitives: a state space, action sets, a feasible-action correspondence, the players' one-period utilities and discount factors, and a transition law. The stochastic law of motion among states is described by the state equation s′ = M(s, (x_1, . . . , x_m), ξ), where s is the previous state of the game, (x_1, . . . , x_m) ∈ D(s) are the players' feasible decisions, ξ is a random disturbance, and M is a continuous function with the property M(0, (0, . . . , 0), ·) = 0, meaning that s = 0 is an absorbing state.
The game is interpreted as follows. Several agents (numbered 1 to m) jointly own a productive asset, the evolution of which is represented by the state variable. At each of the infinitely many stages of the game, the players observe the state s ∈ S and simultaneously choose their actions (x_1, . . . , x_m) ∈ A(s)^m, expressing which part of the available stock each of them wishes to consume. Provided that the actions are feasible, i.e., (x_1, . . . , x_m) ∈ D(s), the players receive their respective utilities u_1(x_1), . . . , u_m(x_m), and the game moves to the next stage, where the new state is obtained from the stochastic technology M(s, (x_1, . . . , x_m), ξ), the output of which depends on the realization of the random variable ξ on a probability space, drawn independently at every stage of the game. If the actions (x_1, . . . , x_m) happen to be infeasible, the players must revise their decisions. Therefore, we restrict our attention only to strategies which generate feasible actions. Let I := {1, 2, . . . , m}. Our further assumptions on the model include, in particular, that the random variable ξ takes values in [0; +∞) with a probability distribution known to the players and with expectation E.
A player's general strategy is a Borel mapping from the space of all possible histories of the game to the space of available actions. The set of all strategies for player i ∈ I is denoted by Π_i. A strategy profile π = (π_1, . . . , π_m) ∈ Π_1 × . . . × Π_m is called feasible if for any state s ∈ S and every possible sequence h of preceding states and actions, the vector (π_1(h, s), . . . , π_m(h, s)) ∈ D(s). Let F be the set of all Borel measurable functions f : S → S such that f(s) ∈ A(s) for every s ∈ S. A stationary Markov strategy for player i is a constant sequence (π_it)_{t∈N}, where π_it = f_i ∈ F for all t ∈ N. Thus, a stationary Markov strategy for a player can be identified with a mapping f ∈ F. We say that a stationary Markov strategy profile (f_1, . . . , f_m) is feasible if and only if it belongs to the space Φ := {(f_1, . . . , f_m) ∈ F^m : (f_1(s), . . . , f_m(s)) ∈ D(s) for every s ∈ S}. Let H := D × D × D × . . . be the space of all infinite histories of the game. For every initial state s_1 = s ∈ S and any feasible strategy profile π = (π_1, . . . , π_m) we can define a probability measure P^π_s and a stochastic process {S_t, X_t} on H in a canonical way (see Chapter 7 in [22]), where S_t and X_t are random variables describing, respectively, the state and the action profile at time t. Then, for each initial state s ∈ S and a feasible strategy profile π, player i's expected discounted reward is

γ_i(π)(s) = E^π_s [ Σ_{t=1}^∞ β_i^{t−1} u_i(X_{ti}) ],

where β_i ∈ (0; 1) is player i's discount factor, X_{ti} is the i-th coordinate of the random vector X_t, and E^π_s is the expectation operator with respect to the probability measure P^π_s. Notation: let ȳ = (y_1, . . . , y_m) be a vector with coordinates belonging to some set Y. If z_i ∈ Y, then (z_i, ȳ_{−i}) signifies the vector ȳ with coordinate y_i replaced by z_i. A feasible strategy profile π* = (π*_1, . . . , π*_m) ∈ Π_1 × . . . × Π_m is called a Nash equilibrium if for each s ∈ S, every player i ∈ I and any π_i ∈ Π_i such that (π_i, π*_{−i}) is feasible, γ_i(π*)(s) ≥ γ_i(π_i, π*_{−i})(s).
A Stationary Markov Perfect Equilibrium (SMPE) is a Nash equilibrium which belongs to the class of strategy profiles Φ.
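The expected discounted reward defined above can be estimated by simulation under any stationary Markov profile. The sketch below uses the paper's specific production function s′ = p · ξ; the lognormal shock, the linear consumption strategies x_i = fractions[i] · s, and all parameter values are illustrative assumptions.

```python
import math
import random

def discounted_reward_mc(s0, fractions, alphas, betas,
                         horizon=200, n_paths=500, seed=1):
    """Monte Carlo estimate of each player's expected discounted reward
    gamma_i(pi)(s0) under stationary strategies x_i = fractions[i] * s,
    with the specific production function s' = p * xi.  The lognormal
    shock and the linear strategies are illustrative assumptions."""
    m = len(fractions)
    rng = random.Random(seed)
    totals = [0.0] * m
    for _ in range(n_paths):
        s = s0
        for t in range(horizon):
            x = [fractions[i] * s for i in range(m)]  # feasible: sum(x) < s
            for i in range(m):
                totals[i] += betas[i] ** t * x[i] ** alphas[i]  # CRRA utility
            p = s - sum(x)                      # joint investment
            xi = math.exp(rng.gauss(0.0, 0.2))  # multiplicative shock
            s = p * xi                          # geometric random walk step
    return [tot / n_paths for tot in totals]

rewards = discounted_reward_mc(
    s0=1.0, fractions=[0.2, 0.3], alphas=[0.5, 0.5], betas=[0.95, 0.9])
print(rewards)
```

Since the fractions sum to less than one, the generated actions are always feasible and both estimated rewards are finite and positive.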

Results
Let V be the space of all nonnegative Borel measurable functions v : S → R such that v(0) = 0. For every f̄ = (f_1, . . . , f_m) ∈ Φ and v ∈ V define m distinct backward-induction dynamic programming operators T_1, . . . , T_m associated with the game. We start with a preliminary problem. For a fixed set (v_1, . . . , v_m) ∈ V^m consider the system of equations (1). Note that it is not immediately clear whether there exists a profile f̄ = (f_1, . . . , f_m) ∈ Φ which satisfies system (1) for all s ∈ S. However, an affirmative answer can be obtained if the functions v_1, . . . , v_m are additionally specified: if V_1, . . . , V_m denote the spaces of power functions v_i(s) = k_i s^{α_i} with k_i ∈ (0; +∞), then there exists a unique profile φ̄ = (φ_1, . . . , φ_m) ∈ Φ which solves the corresponding system (1) for all s ∈ S. Proof. For all i ∈ I let v_i(s) = k_i s^{α_i}, where k_i ∈ (0; +∞). Define w_i : D → R for every i ∈ I so that T_i(f̄, v_i)(s) = w_i(s, f̄(s)) for every i ∈ I, s ∈ S and f̄ = (f_1, . . . , f_m) ∈ Φ; this equality explains the motivation behind introducing the functions w_i.
Arbitrarily choose i ∈ I and a set of functions (f_j)_{j∈I(i)}, where I(i) := I \ {i}, such that (0, f̄_{−i}) ∈ Φ. Fix s ∈ S and consider the maximization problem (2) of maximizing w_i(s, (x_i, f̄_{−i}(s))) over the feasible consumptions x_i. Problem (2) has the trivial solution x_i = 0 in the case when Σ_{j∈I(i)} f_j(s) = s.
Then, for every x_i in the interval [0; s − Σ_{j∈I(i)} f_j(s)], the objective w_i(s, (x_i, f̄_{−i}(s))) is concave in the interior of the domain and continuous, and is therefore a concave function with respect to x_i.

Notice that there exists a point x̃_i at which the first-order condition for problem (2) holds. By the sufficient extremum condition, the point x̃_i is a local maximum of w_i(s, (x_i, f̄_{−i}(s))). Furthermore, x̃_i is a global maximum of w_i(s, (x_i, f̄_{−i}(s))) due to the concavity of the objective function with respect to x_i.
Computing the arg max explicitly, we conclude that a function f̃_i, defined for all s ∈ S, maximizes the operator T_i((y_i, f̄_{−i}), v_i)(s) for any s ∈ S and any set of functions (f_j)_{j∈I(i)} such that (0, f̄_{−i}) ∈ Φ.
Let q_i := (c_i k_i l_i)^{1/(1−α_i)}. Clearly, q_i ∈ (0; +∞) for all i ∈ I.
In accordance with this notation, the first m equations of system (3) are equivalent to a system of linear equations with a unique solution φ̄ = (φ_1, . . . , φ_m). The fact that for every s ∈ S

Σ_{i=1}^m φ_i(s) = (Σ_{j=1}^m q_j) / (1 + Σ_{j=1}^m q_j) · s < s

implies that φ̄ ∈ Φ. Therefore, φ̄ is the unique solution of system (1) for all s ∈ S.
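Given the coefficients q_i, the equilibrium consumptions are straightforward to compute numerically. The sketch below assumes the per-player linear form φ_i(s) = q_i s / (1 + Σ_j q_j), which is consistent with the aggregate consumption identity displayed above; the q_i values are illustrative.

```python
def equilibrium_consumptions(q, s):
    """Candidate stationary equilibrium consumptions phi_i(s).
    Assumes the linear per-player form phi_i(s) = q_i * s / (1 + sum(q)),
    consistent with the aggregate identity
    sum_i phi_i(s) = (sum(q) / (1 + sum(q))) * s < s."""
    total_q = sum(q)
    return [qi * s / (1.0 + total_q) for qi in q]

q = [0.4, 0.6, 1.0]   # illustrative q_i values, each in (0; +inf)
s = 5.0               # current resource stock
phi = equilibrium_consumptions(q, s)
joint_investment = s - sum(phi)   # p = s - sum_i phi_i(s) > 0
print(phi, joint_investment)
```

Note that the aggregate consumption is strictly smaller than s, so the joint investment is always positive and the profile is feasible.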
It remains to calculate T_i(φ̄, v_i)(s) for every i ∈ I.
Note that we can use equality (7) to rewrite equations (6) and obtain that the values q*_1, . . . , q*_m satisfy, for every i ∈ I, a property equivalent to relation (8). For each i ∈ I introduce a value k*_i ∈ (0; +∞). Rewriting (8) in terms of k*_i and multiplying both sides of the equation by k*_i provides the result which was to be proved, for all i ∈ I.
We are now ready to formulate the main result.
Proof. For every i ∈ I put v*_i(s) := k*_i s^{α_i}. By Lemmas 1 and 2, the corresponding equilibrium properties hold for every i ∈ I, where f̄* = (f*_1, . . . , f*_m). In order to apply the dynamic programming theorems appropriately, it remains to check whether lim_{t→∞} E^π_s[β_i^{t−1} v*_i(S_t)] equals zero for every i ∈ I, every initial state s_1 = s ∈ S and every strategy π_i ∈ Π_i such that π = (π_i, f̄*_{−i}) is feasible.
Observe that for any initial state of the game s ∈ S and any feasible strategy profile π ∈ Π_1 × . . . × Π_m, the corresponding stochastic process {S_t, X_t} satisfies S_{t+1} = (S_t − Σ_{i=1}^m X_{ti}) · ξ_t ≤ S_t · ξ_t, where ξ_t is the realization of the random variable ξ at stage t.
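Since the joint investment never exceeds the current stock, iterating this bound gives the pathwise estimate S_{t+1} ≤ s · ξ_1 · . . . · ξ_t along any play. A small simulation can confirm the bound; the lognormal shocks and the random feasible consumption rule are illustrative assumptions.

```python
import math
import random

rng = random.Random(42)
s0 = 1.0
s = s0
shock_product = 1.0

for t in range(200):
    # Arbitrary feasible consumptions: random fractions summing to < 1.
    fracs = [rng.uniform(0.0, 0.3) for _ in range(3)]
    p = s * (1.0 - sum(fracs))           # joint investment, 0 < p <= s
    xi = math.exp(rng.gauss(0.0, 0.3))   # multiplicative shock
    s = p * xi                           # next state, s' = p * xi
    shock_product *= xi
    # Pathwise growth bound used in the proof: S_{t+1} <= s0 * xi_1 ... xi_t
    assert s <= s0 * shock_product * (1 + 1e-9)

print("bound held along the whole path; final state:", s)
```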
Then, for every i ∈ I, every s ∈ S and every strategy π_i ∈ Π_i such that π = (π_i, f̄*_{−i}) is feasible, the required limit equals zero; the last equality in the corresponding chain of estimates is due to Assumption A3.

Concluding remarks
Remark 1. As a consequence of Lemma 1, there also exists a unique Markov perfect equilibrium in the analogously defined finite-horizon game, in which at the final stage the players split the available resource in pre-determined proportions.
Indeed, if player i's utility at the horizon time T is equal to k_i s_T^{α_i}, where k_i ∈ (0; +∞) and s_T is the observed state, then Lemma 1 ensures that there is a unique equilibrium solution to every one-shot subproblem obtained sequentially via backward induction. Together, by the optimality principle, these solutions constitute a unique (backwardly constructed) Markov perfect equilibrium of the game.

Remark 2.
A numerical calculation of the SMPE from Theorem 1 and the corresponding rewards is straightforward. By construction, it suffices to find a point z* at which the function g(z) equals 1. Since g(z) is continuous, the task can be carried out with standard numerical methods.
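The root-finding step can be sketched as follows, assuming only that g is continuous and that a bracket with g(a) > 1 > g(b) is available. The concrete g used in the demonstration is a hypothetical stand-in, not the function from Theorem 1.

```python
def solve_g_equals_one(g, a, b, tol=1e-10):
    """Bisection for g(z*) = 1 on [a, b], assuming g is continuous
    and g(z) - 1 changes sign on the bracket."""
    fa = g(a) - 1.0
    for _ in range(200):
        mid = 0.5 * (a + b)
        fm = g(mid) - 1.0
        if abs(fm) < tol or (b - a) < tol:
            return mid
        if (fa > 0) == (fm > 0):
            a, fa = mid, fm
        else:
            b = mid
    return 0.5 * (a + b)

# Hypothetical stand-in for g: continuous, decreasing, crosses 1 at z = 1.
g_demo = lambda z: 2.0 / (1.0 + z)
z_star = solve_g_equals_one(g_demo, 0.0, 10.0)
print(z_star)  # close to 1
```

Any standard bracketing root-finder (e.g. Brent's method) would serve equally well; bisection is shown only because it needs nothing beyond continuity and a sign change.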